Search Results: "shell"

16 September 2023

Sergio Talens-Oliag: GitLab CI/CD Tips: Using a Common CI Repository with Assets

This post describes how to handle files that are used as assets by jobs and pipelines defined on a common gitlab-ci repository when we include those definitions from a different project.

Problem descriptionWhen a .giltlab-ci.yml file includes files from a different repository its contents are expanded and the resulting code is the same as the one generated when the included files are local to the repository. In fact, even when the remote files include other files everything works right, as they are also expanded (see the description of how included files are merged for a complete explanation), allowing us to organise the common repository as we want. As an example, suppose that we have the following script on the assets/ folder of the common repository:

dumb.sh

#!/bin/sh
echo "The script arguments are: '$@'"

If we run the following job on the common repository:

job:
  script:
    - $CI_PROJECT_DIR/assets/dumb.sh ARG1 ARG2

the output will be:

The script arguments are: 'ARG1 ARG2'

But if we run the same job from a different project that includes the same job definition the output will be different:

/scripts-23-19051/step_script: eval: line 138: d./assets/dumb.sh: not found

The problem here is that we include and expand the YAML files, but if a script wants to use other files from the common repository as an asset (configuration file, shell script, template, etc.), the execution fails if the files are not available on the project that includes the remote job definition.

SolutionsWe can solve the issue using multiple approaches, I ll describe two of them:

Create files using scripts
Download files from the common repository

Create files using scriptsOne way to dodge the issue is to generate the non YAML files from scripts included on the pipelines using HERE documents. The problem with this approach is that we have to put the content of the files inside a script on a YAML file and if it uses characters that can be replaced by the shell (remember, we are using HERE documents) we have to escape them (error prone) or encode the whole file into base64 or something similar, making maintenance harder. As an example, imagine that we want to use the dumb.sh script presented on the previous section and we want to call it from the same PATH of the main project (on the examples we are using the same folder, in practice we can create a hidden folder inside the project directory or use a PATH like /tmp/assets-$CI_JOB_ID to leave things outside the project folder and make sure that there will be no collisions if two jobs are executed on the same place (i.e. when using a ssh runner). To create the file we will use hidden jobs to write our script template and reference tags to add it to the scripts when we want to use them. Here we have a snippet that creates the file with cat:

.file_scripts:
  create_dumb_sh:
    -  
      # Create dumb.sh script
      mkdir -p "$ CI_PROJECT_DIR /assets"
      cat >"$ CI_PROJECT_DIR /assets/dumb.sh" <<EOF
      #!/bin/sh
      echo "The script arguments are: '\$@'"
      EOF
      chmod +x "$ CI_PROJECT_DIR /assets/dumb.sh"

Note that to make things work we ve added 6 spaces before the script code and escaped the dollar sign. To do the same using base64 we replace the previous snippet by this:

.file_scripts:
  create_dumb_sh:
    -  
      # Create dumb.sh script
      mkdir -p "$ CI_PROJECT_DIR /assets"
      base64 -d >"$ CI_PROJECT_DIR /assets/dumb.sh" <<EOF
      IyEvYmluL3NoCmVjaG8gIlRoZSBzY3JpcHQgYXJndW1lbnRzIGFyZTogJyRAJyIK
      EOF
      chmod +x "$ CI_PROJECT_DIR /assets/dumb.sh"

Again, we have to indent the base64 version of the file using 6 spaces (all lines of the base64 output have to be indented) and to make changes we have to decode and re-code the file manually, making it harder to maintain. With either version we just need to add a !reference before using the script, if we add the call on the first lines of the before_script we can use the downloaded file in the before_script, script or after_script sections of the job without problems:

job:
  before_script:
    - !reference [.file_scripts, create_dumb_sh]
  script:
    - $ CI_PROJECT_DIR /assets/dumb.sh ARG1 ARG2

The output of a pipeline that uses this job will be the same as the one shown in the original example:

The script arguments are: 'ARG1 ARG2'

Download the files from the common repositoryAs we ve seen the previous solution works but is not ideal as it makes the files harder to read, maintain and use. An alternative approach is to keep the assets on a directory of the common repository (in our examples we will name it assets) and prepare a YAML file that declares some variables (i.e. the URL of the templates project and the PATH where we want to download the files) and defines a script fragment to download the complete folder. Once we have the YAML file we just need to include it and add a reference to the script fragment at the beginning of the before_script of the jobs that use files from the assets directory and they will be available when needed. The following file is an example of the YAML file we just mentioned:

bootstrap.yml

variables:
  CI_TMPL_API_V4_URL: "$ CI_API_V4_URL /projects/common%2Fci-templates"
  CI_TMPL_ARCHIVE_URL: "$ CI_TMPL_API_V4_URL /repository/archive"
  CI_TMPL_ASSETS_DIR: "/tmp/assets-$ CI_JOB_ID "
.scripts_common:
  bootstrap_ci_templates:
    -  
      # Downloading assets
      echo "Downloading assets"
      mkdir -p "$CI_TMPL_ASSETS_DIR"
      wget -q -O - --header="PRIVATE-TOKEN: $CI_TMPL_READ_TOKEN" \
        "$CI_TMPL_ARCHIVE_URL?path=assets&sha=$ CI_TMPL_REF:-main "  
        tar --strip-components 2 -C "$ASSETS_DIR" -xzf -

The file defines the following variables:

CI_TMPL_API_V4_URL: URL of the common project, in our case we are using the project ci-templates inside the common group (note that the slash between the group and the project is escaped, that is needed to reference the project by name, if we don t like that approach we can replace the url encoded path by the project id, i.e. we could use a value like $ CI_API_V4_URL /projects/31)
CI_TMPL_ARCHIVE_URL: Base URL to use the gitlab API to download files from a repository, we will add the arguments path and sha to select which sub path to download and from which commit, branch or tag (we will explain later why we use the CI_TMPL_REF, for now just keep in mind that if it is not defined we will download the version of the files available on the main branch when the job is executed).
CI_TMPL_ASSETS_DIR: Destination of the downloaded files.

And uses variables defined in other places:

CI_TMPL_READ_TOKEN: token that includes the read_api scope for the common project, we need it because the tokens created by the CI/CD pipelines of other projects can t be used to access the api of the common one.We define the variable on the gitlab CI/CD variables section to be able to change it if needed (i.e. if it expires)
CI_TMPL_REF: branch or tag of the common repo from which to get the files (we need that to make sure we are using the right version of the files, i.e. when testing we will use a branch and on production pipelines we can use fixed tags to make sure that the assets don t change between executions unless we change the reference).We will set the value on the .gitlab-ci.yml file of the remote projects and will use the same reference when including the files to make sure that everything is coherent.

This is an example YAML file that defines a pipeline with a job that uses the script from the common repository:

pipeline.yml

include:
  - /bootstrap.yaml
stages:
  - test
dumb_job:
  stage: test
  before_script:
    - !reference [.bootstrap_ci_templates, create_dumb_sh]
  script:
    - $ CI_TMPL_ASSETS_DIR /dumb.sh ARG1 ARG2

To use it from an external project we will use the following gitlab ci configuration:

gitlab-ci.yml

include:
  - project: 'common/ci-templates'
    ref: &ciTmplRef 'main'
    file: '/pipeline.yml'
variables:
  CI_TMPL_REF: *ciTmplRef

Where we use a YAML anchor to ensure that we use the same reference when including and when assigning the value to the CI_TMPL_REF variable (as far as I know we have to pass the ref value explicitly to know which reference was used when including the file, the anchor allows us to make sure that the value is always the same in both places). The reference we use is quite important for the reproducibility of the jobs, if we don t use fixed tags or commit hashes as references each time a job that downloads the files is executed we can get different versions of them. For that reason is not a bad idea to create tags on our common repo and use them as reference on the projects or branches that we want to behave as if their CI/CD configuration was local (if we point to a fixed version of the common repo the way everything is going to work is almost the same as having the pipelines directly in our repo). But while developing pipelines using branches as references is a really useful option; it allows us to re-run the jobs that we want to test and they will download the latest versions of the asset files on the branch, speeding up the testing process. However keep in mind that the trick only works with the asset files, if we change a job or a pipeline on the YAML files restarting the job is not enough to test the new version as the restart uses the same job created with the current pipeline. To try the updated jobs we have to create a new pipeline using a new action against the repository or executing the pipeline manually.

ConclusionFor now I m using the second solution and as it is working well my guess is that I ll keep using that approach unless giltab itself provides a better or simpler way of doing the same thing.

12 September 2023

John Goerzen: A Maze of Twisty Little Pixels, All Tiny

Two years ago, I wrote Managing an External Display on Linux Shouldn t Be This Hard. Happily, since I wrote that post, most of those issues have been resolved. But then you throw HiDPI into the mix and it all goes wonky. If you re running X11, basically the story is that you can change the scale factor, but it only takes effect on newly-launched applications (which means a logout/in because some of your applications you can t really re-launch). That is a problem if, like me, you sometimes connect an external display that is HiDPI, sometimes not, or your internal display is HiDPI but others aren t. Wayland is far better, supporting on-the-fly resizes quite nicely. I ve had two devices with HiDPI displays: a Surface Go 2, and a work-issued Thinkpad. The Surface Go 2 is my ultraportable Linux tablet. I use it sparingly at home, and rarely with an external display. I just put Gnome on it, in part because Gnome had better on-screen keyboard support at the time, and left it at that. On the work-issued Thinkpad, I really wanted to run KDE thanks to its tiling support (I wound up using bismuth with it). KDE was buggy with Wayland at the time, so I just stuck with X11 and ran my HiDPI displays at lower resolutions and lived with the fuzziness. But now that I have a Framework laptop with a HiDPI screen, I wanted to get this right. I tried both Gnome and KDE. Here are my observations with both: Gnome I used PaperWM with Gnome. PaperWM is a tiling manager with a unique horizontal ribbon approach. It grew on me; I think I would be equally at home, or maybe even prefer it, to my usual xmonad-style approach. Editing the active window border color required editing ~/.local/share/gnome-shell/extensions/paperwm@hedning:matrix.org/stylesheet.css and inserting background-color and border-color items in the paperwm-selection section. Gnome continues to have an absolutely terrible picture for configuring things. It has no less than four places to make changes (Settings, Tweaks, Extensions, and dconf-editor). In many cases, configuration for a given thing is split between Settings and Tweaks, and sometimes even with Extensions, and then there are sometimes options that are only visible in dconf. That is, where the Gnome people have even allowed something to be configurable. Gnome installs a power manager by default. It offers three options: performance, balanced, and saver. There is no explanation of the difference between them. None. What is it setting when I change the pref? A maximum frequency? A scaling governor? A balance between performance and efficiency cores? Not only that, but there s no way to tell it to just use performance when plugged in and balanced or saver when on battery. In an issue about adding that, a Gnome dev wrote We re not going to add a preference just because you want one . KDE, on the other hand, aside from not mucking with your system s power settings in this way, has a nice panel with on AC and on battery and you can very easily tweak various settings accordingly. The hostile attitude from the Gnome developers in that thread was a real turnoff. While Gnome has excellent support for Wayland, it doesn t (directly) support fractional scaling. That is, you can set it to 100%, 200%, and so forth, but no 150%. Well, unless you manage to discover that you can run gsettings set org.gnome.mutter experimental-features "['scale-monitor-framebuffer']" first. (Oh wait, does that make a FIFTH settings tool? Why yes it does.) Despite its name, that allows you to select fractional scaling under Wayland. For X11 apps, they will be blurry, a problem that is optional under KDE (more on that below). Gnome won t show the battery life time remaining on the task bar. Yikes. An extension might work in some cases. Not only that, but the Gnome battery icon frequently failed to indicate AC charging when AC was connected, a problem that didn t exist on KDE. Both Gnome and KDE support night light (warmer color temperatures at night), but Gnome s often didn t change when it should have, or changed on one display but not the other. The appindicator extension is pretty much required, as otherwise a number of applications (eg, Nextcloud) don t have their icon display anywhere. It does, however, generate a significant amount of log spam. There may be a fix for this. Unlike KDE, which has a nice inobtrusive popup asking what to do, Gnome silently automounts USB sticks when inserted. This is often wrong; for instance, if I m about to dd a Debian installer to it, I definitely don t want it mounted. I learned this the hard way. It is particularly annoying because in a GUI, there is no reason to mount a drive before the user tries to access it anyhow. It looks like there is a dconf setting, but then to actually mount a drive you have to open up Files (because OF COURSE Gnome doesn t have a nice removable-drives icon like KDE does) and it s a bunch of annoying clicks, and I didn t want to use the GUI file manager anyway. Same for unmounting; two clicks in KDE thanks to the task bar icon, but in Gnome you have to open up the file manager, unmount the drive, close the file manager again, etc. The ssh agent on Gnome doesn t start up for a Wayland session, though this is easily enough worked around. The reason I completely soured on Gnome is that after using it for awhile, I noticed my laptop fans spinning up. One core would be constantly busy. It was busy with a kworker events task, something to do with sound events. Logging out would resolve it. I believe it to be a Gnome shell issue. I could find no resolution to this, and am unwilling to tolerate the decreased battery life this implies. The Gnome summary: it looks nice out of the box, but you quickly realize that this is something of a paper-thin illusion when you try to actually use it regularly. KDE The KDE experience on Wayland was a little bit opposite of Gnome. While with Gnome, things start out looking great but you realize there are some serious issues (especially battery-eating), with KDE things start out looking a tad rough but you realize you can trivially fix them and wind up with a very solid system. Compared to Gnome, KDE never had a battery-draining problem. It will show me estimated battery time remaining if I want it to. It will do whatever I want it to when I insert a USB drive. It doesn t muck with my CPU power settings, and lets me easily define on AC vs on battery settings for things like suspend when idle. KDE supports fractional scaling, to any arbitrary setting (even with the gsettings thing above, Gnome still only supports it in 25% increments). Then the question is what to do with X11-only applications. KDE offers two choices. The first is Scaled by the system , which is also the only option for Gnome. With that setting, the X11 apps effectively run natively at 100% and then are scaled up within Wayland, giving them a blurry appearance on HiDPI displays. The advantage is that the scaling happens within Wayland, so the size of the app will always be correct even when the Wayland scaling factor changes. The other option is Apply scaling themselves , which uses native X11 scaling. This lets most X11 apps display crisp and sharp, but then if the system scaling changes, due to limitations of X11, you ll have to restart the X apps to get them to be the correct size. I appreciate the choice, and use Apply scaling by themselves because only a few of my apps aren t Wayland-aware. I did encounter a few bugs in KDE under Wayland: sddm, the display manager, would be slow to stop and cause a long delay on shutdown or reboot. This seems to be a known issue with sddm and Wayland, and is easily worked around by adding a systemd TimeoutStopSec. Konsole, the KDE terminal emulator, has weird display artifacts when using fractional scaling under Wayland. I applied some patches and rebuilt Konsole and then all was fine. The Bismuth tiling extension has some pretty weird behavior under Wayland, but a 1-character patch fixes it. On Debian, KDE mysteriously installed Pulseaudio instead of Debian s new default Pipewire, but that was easily fixed as well (and Pulseaudio also works fine). Conclusions I m sticking with KDE. Given that I couldn t figure out how to stop Gnome from deciding to eat enough battery to make my fan come on, the decision wasn t hard. But even if it weren t for that, I d have gone with KDE. Once a couple of things were patched, the experience is solid, fast, and flawless. Emacs (my main X11-only application) looks great with the self-scaling in KDE. Gimp, which I use occasionally, was terrible with the blurry scaling in Gnome. Update: Corrected the gsettings command

11 September 2023

John Goerzen: For the First Time In Years, I m Excited By My Computer Purchase

Some decades back, when I d buy a new PC, it would unlock new capabilities. Maybe AGP video, or a PCMCIA slot, or, heck, sound. Nowadays, mostly new hardware means things get a bit faster or less crashy, or I have some more space for files. It s good and useful, but sorta meh. Not this purchase. Cory Doctorow wrote about the Framework laptop in 2021:

There s no tape. There s no glue. Every part has a QR code that you can shoot with your phone to go to a service manual that has simple-to-follow instructions for installing, removing and replacing it. Every part is labeled in English, too! The screen is replaceable. The keyboard is replaceable. The touchpad is replaceable. Removing the battery and replacing it takes less than five minutes. The computer actually ships with a screwdriver.

Framework had been on my radar for awhile. But for various reasons, when I was ready to purchase, I didn t; either the waitlist was long, or they didn t have the specs I wanted. Lately my aging laptop with 8GB RAM started OOMing (running out of RAM). My desktop had developed a tendency to hard hang about once a month, and I researched replacing it, but the cost was too high to justify. But when I looked into the Framework, I thought: this thing could replace both. It is a real shift in perspective to have a laptop that is nearly as upgradable as a desktop, and can be specced out to exactly what I wanted: 2TB storage and 64GB RAM. And still cheaper than a Macbook or Thinkpad with far lower specs, because the Framework uses off-the-shelf components as much as possible. Cory Doctorow wrote, in The Framework is the most exciting laptop I ve ever broken:

The Framework works beautifully, but it fails even better Framework has designed a small, powerful, lightweight machine it works well. But they ve also designed a computer that, when you drop it, you can fix yourself. That attention to graceful failure saved my ass.

I like small laptops, so I ordered the Framework 13. I loaded it up with the 64GB RAM and 2TB SSD I wanted. Frameworks have four configurable ports, which are also hot-swappable. I ordered two USB-C, one USB-A, and one HDMI. I put them in my preferred spots (one USB-C on each side for easy docking and charging). I put Debian on it, and it all Just Worked. Perfectly. Now, I orderd the DIY version. I hesitated about this I HATE working with laptops because they re all so hard, even though I KNEW this one was different but went for it, because my preferred specs weren t available in a pre-assembled model. I m glad I did that, because assembly was actually FUN. I got my box. I opened it. There was the bottom shell with the motherboard and CPU installed. Here are the RAM sticks. There s the SSD. A minute or two with each has them installed. Put the bezel on the screen, attach the keyboard it has magnets to guide it into place and boom, ready to go. Less than 30 minutes to assemble a laptop nearly from scratch. It was easier than assembling most desktops. So now, for the first time, my main computing device is a laptop. Rather than having a desktop and a laptop, I just have a laptop. I ll be able to upgrade parts of it later if I want to. I can rearrange the ports. And I can take all my most important files with me. I m quite pleased!

10 September 2023

Freexian Collaborators: Debian Contributions: /usr-merge updates, Salsa CI progress, DebConf23 lead-up, and more! (by Utkarsh Gupta)

Contributing to Debian is part of Freexian s mission. This article covers the latest achievements of Freexian and their collaborators. All of this is made possible by organizations subscribing to our Long Term Support contracts and consulting services.

/usr-merge work, by Helmut Grohne, et al. Given that we now have consensus on moving forward by moving aliased files from `/` to `/usr`, we will also run into the problems that the file move moratorium was meant to prevent. The way forward is detecting them early and applying workarounds on a per-package basis. Said detection is now automated using the Debian Usr Merge Analysis Tool. As problems are reported to the bug tracking system, they are connected to the reports if properly usertagged. Bugs and patches for problem categories DEP17-P2 and DEP17-P6 have been filed. After consensus has been reached on the bootstrapping matters, `debootstrap` has been changed to swap the initial unpack and merging to avoid unpack errors due to pre-existing links. This is a precondition for having `base-files` install the aliasing symbolic links eventually. It was identified that the root filesystem used by the Debian installer is still unmerged and a change has been proposed. `debhelper` was changed to recognize systemd units installed to /usr. A discussion with the CTTE and release team on repealing the moratorium has been initiated.

Salsa CI work, by Santiago Ruano Rinc n August was a busy month in the Salsa CI world. Santiago reviewed and merged a bunch of MRs that have improved the project in different aspects: The aptly job got two MRs from Philip Hands. With the first one, the aptly now can export a couple of variables in a dotenv file, and with the second, it can include packages from multiple artifact directories. These MRs bring the base to improve how to test reverse dependencies with Salsa CI. Santiago is working on documenting this. As a result of the mass bug filing done in August, Salsa CI now includes a job to test how a package builds twice in a row. Thanks to the MRs of Sebastiaan Couwenberg and Johannes Schauer Marin Rodrigues. Last but not least, Santiago helped Johannes Schauer Marin Rodrigues to complete the support for arm64-only pipelines.

DebConf23 lead-up, by Stefano Rivera Stefano wears a few hats in the DebConf organization and in the lead up to the conference in mid-September, they ve all been quite busy. As one of the treasurers of DebConf 23, there has been a final budget update, and quite a few payments to coordinate from Debian s Trusted Organizations. We try to close the books from the previous conference at the next one, so a push was made to get DebConf 22 account statements out of TOs and record them in the conference ledger. As a website developer, we had a number of registration-related tasks, emailing attendees and trying to estimate numbers for food and accommodation. As a conference committee member, the job was mostly taking calls and helping the local team to make decisions on urgent issues. For example, getting conference visas issued to attendees required getting political approval from the Indian government. We only discovered the full process for this too late to clear some complex cases, so this required some hard calls on skipping some countries from the application list, allowing everyone else to get visas in time. Unfortunate, but necessary.

Miscellaneous contributions

Rapha l Hertzog updated gnome-shell-extension-hamster to a new upstream git snapshot that is compatible with GNOME Shell 44 that was recently uploaded to Debian unstable/testing. This extension makes it easy to start/stop tracking time with Hamster Time Tracker. Very handy for consultants like us who are billing their work per hour.

Rapha l also updated zim to the latest upstream release (0.74.2). This is a desktop wiki that can be very useful as a note-taking tool to build your own personal knowledge base or even to manage your personal todo lists.

Utkarsh reviewed and sponsored some uploads from mentors.debian.net.

Utkarsh helped the local team and the bursary team with some more DebConf activities and helped finalize the data.

Thorsten tried to update package hplip. Unfortunately upstream added some new compressed files that need to appear uncompressed in the package. Even though this sounded like an easy task, which seemed to be already implemented in the current debian/rules, the new type of files broke this implementation and made the package no longer buildable. The problem has been solved and the upload will happen soon.

Helmut sent 7 patches for cross build failures. Since `dpkg-buildflags` now defaults to issue `arm64`-specific compiler flags, more care is needed to distinguish between build architecture flags and host architecture flags than previously.

Stefano pushed the final bit of the tox 4 transition over the line in Debian, allowing dh-python and tox 4 to migrate to testing. We got caught up in a few unusual bugs in tox and the way we run it in Debian package building (which had to change with tox 4). This resulted in a couple of patches upstream.

Stefano visited Haifa, Israel, to see the proposed DebConf 24 venue and meet with the local team. While the venue isn t committed yet, we have high hopes for it.

29 August 2023

Jonathan Dowland: Gazelle Twin

A couple of releases

I discovered Gazelle Twin last year via Stuart Maconie's Freak Zone (two of her tracks ended up on my 2022 Halloween playlist). Through her website I learned of The Horror Show! exhibition at Somerset House¹ in London that I managed to visit earlier this year. I've been intending to write a 5-track blog post (a la Underworld, the Cure, Coil) for a while but I have been spurred on by the excellent news that she's got a new album on the way, and, she's performing at the Sage Gateshead in November. Buy tickets now!! Here's the five tracks I recommend to get started:

Anti-Body, from 2014's UNFLESH. I particularly love the percussion. Perc did a good hard-house-style remix on Fleshed Out, the companion remix album. Anti-Body by Gazelle Twin
Fire Leap, from Gazelle Twin and NYX's collaborative album Deep England. The album is a re-interpretation of material from Gazelle Twin's earlier album, Pastoral, with the exception of this track, which is a cover of Paul Giovanni's song from The Wicker Man. There's a common aesthetic in all three works: eerie-folk, England's self-mythologising as seen through a warped and cracked lens. Anti-Body by Gazelle Twin There's a fantastic performance video of the whole album, directed by Iain Forsyth and Jane Pollard, available on YouTube.
Better In My Day, from the aforementioned Pastoral. This track and this album are, I think, less accessible, more challenging than the re-interpreted material. That's not a bad thing: I'm still working on digesting it! This is one of the more abrasive, confrontational tracks. Better In My Day by Gazelle Twin
I am Shell I am Bone, from way back to her first release, The Entire City in 2011. Composed, recorded, self-produced, self-released. It's remarkable to me how different each phase of GT's work are to one another. This album evokes a strong sense of atmosphere and place to me. There's a hint of possible influence of Joy Division's Unknown Pleasures (or New Order's Movement) in places. The b-side to this song is a cover of Joy Division's The Eternal. I Am Shell I Am Bone by Gazelle Twin
GT re-issued This Entire City in 2022 along with a companion-piece EP of newly-released material from the same era, The Wastelands. This isn't cutting room floor stuff, though, as evidenced by the strength of my final pick, Hole in my Heart. Hole In My Heart by Gazelle Twin

It's hard to pick just five tracks when doing these (that's the point I suppose). I note that I haven't picked anything from her wide-ranging soundtrack work: her last three or four of her releases have been soundtrack works, released on the well-respected UK label Invada, as will be her forthcoming album. You can find all the released stuff on Gazelle Twin's Bandcamp Page.

I recently learned that PJ Harvey once hosted album sessions in Somerset House, and allowed fans to watch her at work during the recording: https://www.theguardian.com/music/2015/jan/16/pj-harvey-somerset-house-recording-in-progress

27 August 2023

Shirish Agarwal: FSCKing /home

There is a bit of context that needs to be shared before I get to this and would be a long one. For reasons known and unknown, I have a lot of sudden electricity outages. Not just me, all those who are on my line. A discussion with a lineman revealed that around 200+ families and businesses are on the same line and when for whatever reason the electricity goes for all. Even some of the traffic lights don t work. This affects software more than hardware or in some cases, both. And more specifically HDD s are vulnerable. I had bought an APC unit several years for precisely this, but over period of time it just couldn t function and trips also when the electricity goes out. It s been 6-7 years so can t even ask customer service to fix the issue and from whatever discussions I have had with APC personnel, the only meaningful difference is to buy a new unit but even then not sure this is an issue that can be resolved, even with that. That comes to the issue that happens once in a while where the system fsck is unable to repair /home and you need to use an external pen drive for the same. This is my how my hdd stacks up
/ is on dev/sda7 /boot is on /dev/sda6, /boot/efi is on /dev/sda2 and /home is on /dev/sda8 so theoretically, if /home for some reason doesn t work I should be able drop down on /dev/sda7, unmount /dev/sda8, run fsck and carry on with my work. I tried it number of times but it didn t work. I was dropping down on tty1 and attempting the same, no dice as root/superuser getting the barest x-term. So first I tried asking couple of friends who live nearby me. Unfortunately, both are MS-Windows users and both use what are called as company-owned laptops . Surfing on those systems were a nightmare. Especially the number of pop-ups of ads that the web has become. And to think about how much harassment ublock origin has saved me over the years. One of the more interesting bits from both their devices were showing all and any downloads from fosshub was showing up as malware. I dunno how much of that is true or not as haven t had to use it as most software we get through debian archives or if needed, download from github or wherever and run/install it and you are in business. Some of them even get compiled into a good .deb package but that s outside the conversation atm. My only experience with fosshub was few years before the pandemic and that was good. I dunno if fosshub really has malware or malwarebytes was giving false positives. It also isn t easy to upload a 600 MB+ ISO file somewhere to see whether it really has malware or not. I used to know of a site or two where you could upload a suspicious file and almost 20-30 famous and known antivirus and anti-malware engines would check it and tell you the result. Unfortunately, I have forgotten the URL and seeing things from MS-Windows perspective, things have gotten way worse than before. So left with no choice, I turned to the local LUG for help. Fortunately, my mobile does have e-mail and I could use gmail to solicit help. While there could have been any number of live CD s that could have helped but one of my first experiences with GNU/Linux was that of Knoppix that I had got from Linux For You (now known as OSFY) sometime in 2003. IIRC, had read an interview of Mr. Klaus Knopper as well and was impressed by it. In those days, Debian wasn t accessible to non-technical users then and Knoppix was a good tool to see it. In fact, think he was the first to come up with the idea of a Live CD and run with it while Canonical/Ubuntu took another 2 years to do it. I think both the CD and the interview by distrowatch was shared by LFY in those early days. Of course, later the story changes after he got married, but I think that is more about Adriane rather than Knoppix. So Vishal Rao helped me out. I got an HP USB 3.2 32GB Type C OTG Flash Drive x5600c (Grey & Black) from a local hardware dealer around similar price point. The dealer is a big one and has almost 200+ people scattered around the city doing channel sales who in turn sell to end users. Asking one of the representatives about their opinion on stopping electronic imports (apparently more things were added later to the list including all sorts of sundry items from digital cameras to shavers and whatnot.) The gentleman replied that he hopes that it would not happen otherwise more than 90% would have to leave their jobs. They already have started into lighting fixtures (LED bulbs, tubelights etc.) but even those would come in the same ban

The main argument as have shared before is that Indian Govt. thinks we need our home grown CPU and while I have no issues with that, as shared before except for RISC-V there is no other space where India could look into doing that. Especially after the Chip Act, Biden has made that any new fabs or any new thing in chip fabrication will only be shared with Five Eyes only. Also, while India is looking to generate about 2000 GW by 2030 by solar, China has an ambitious 20,000 GW generation capacity by the end of this year and the Chinese are the ones who are actually driving down the module prices. The Chinese are also automating their factories as if there s no tomorrow. The end result of both is that China will continue to be the world s factory floor for the foreseeable future and whoever may try whatever policies, it probably is gonna be difficult to compete with them on prices of electronic products. That s the reason the U.S. has been trying so that China doesn t get the latest technology but that perhaps is a story for another day.

HP USB 3.2 Type C OTG Flash Drive x5600c For people who have had read this blog they know that most of the flash drives today are MLC Drives and do not have the longevity of the SLC Drives. For those who maybe are new, this short brochure/explainer from Kingston should enhance your understanding. SLC Drives are rare and expensive. There are also a huge number of counterfeit flash drives available in the market and almost all the companies efforts whether it s Kingston, HP or any other manufacturer, they have been like a drop in the bucket. Coming back to the topic at hand. While there are some tools that can help you to figure out whether a pen drive is genuine or not. While there are products that can tell you whether they are genuine or not (basically by probing the memory controller and the info. you get from that.) that probably is a discussion left for another day. It took me couple of days and finally I was able to find time to go Vishal s place. The journey of back and forth lasted almost 6 hours, with crazy traffic jams. Tells you why Pune or specifically the Swargate, Hadapsar patch really needs a Metro. While an in-principle nod has been given, it probably is more than 5-7 years or more before we actually have a functioning metro. Even the current route the Metro has was supposed to be done almost 5 years to the date and even the modified plan was of 3 years ago. And even now, most of the Stations still need a lot of work to be done. PMC, Deccan as examples etc. still have loads to be done. Even PMT (Pune Muncipal Transport) that that is supposed to do the last mile connections via its buses has been putting half-hearted attempts

Vishal Rao While Vishal had apparently seen me and perhaps we had also interacted, this was my first memory of him although we have been on a few boards now and then including stackexchange. He was genuine and warm and shared 4-5 distros with me, including Knoppix and System Rescue as shared by Arun Khan. While this is and was the first time I had heard about Ventoy apparently Vishal has been using it for couple of years now. It s a simple shell script that you need to download and run on your pen drive and then just dump all the .iso images. The easiest way to explain ventoy is that it looks and feels like Grub. Which also reminds me an interaction I had with Vishal on mobile. While troubleshooting the issue, I was unsure whether it was filesystem that was the issue or also systemd was corrupted. Vishal reminded me of putting fastboot to the kernel parameters to see if I m able to boot without fscking and get into userspace i.e. /home. Although journalctl and systemctl were responding even on tty1 still was a bit apprehensive. Using fastboot was able to mount the whole thing and get into userspace and that told me that it s only some of the inodes that need clearing and there probably are some orphaned inodes. While Vishal had got a mini-pc he uses that a server, downloads stuff to it and then downloads stuff from it. From both privacy, backup etc. it is a better way to do things but then you need to laptop to access it. I am sure he probably uses it for virtualization and other ways as well but we just didn t have time for that discussion. Also a mini-pc can set you back anywhere from 25 to 40k depending on the mini-pc and the RAM and the SSD. And you need either a lappy or an Raspberry Pi with some kinda visual display to interact with the mini-pc. While he did share some of the things, there probably could have been a far longer interaction just on that but probably best left for another day. Now at my end, the system I had bought is about 5-6 years old. At that time it only had 6 USB 2.0 drives and 2 USB 3.0 (A) drives.
The above image does tell of the various form factors. One of the other things is that I found the pendrive and its connectors to be extremely fiddly. It took me number of times fiddling around with it when I was finally able to put in and able to access the pen drive partitions. Unfortunately, was unable to see/use systemrescue but Knoppix booted up fine. I mounted the partitions briefly to see where is what and sure enough /dev/sda8 showed my /home files and folders. Unmounted it, then used $fsck -y /dev/sda8 and back in business. This concludes what happened. Updates Quite a bit was left out on the original post, part of which I didn t know and partly stuff which is interesting and perhaps need a blog post of their own. It s sad I won t be part of debconf otherwise who knows what else I would have come to know.

One of the interesting bits that I came to know about last week is the Alibaba T-Head T-Head TH1520 RISC-V CPU and saw it first being demoed on a laptop and then a standalone tablet. The laptop is an interesting proposition considering Alibaba opened up it s chip thing only couple of years ago. To have an SOC within 18 months and then under production for lappies and tablets is practically unheard of especially of a newbie/startup. Even AMD took 3-4 years for its first chip.It seems they (Alibaba) would be parceling them out by quarter end 2023 and another 1000 pieces/Units first quarter next year, while the scale is nothing compared to the behemoths, I think this would be more as a matter of getting feedback on both the hardware and software. The value proposition is much better than what most of us get, at least in India. For example, they are doing a warranty for 5 years and also giving spare parts. RISC-V has been having a lot of resurgence in China in part as its an open standard and partly development will be far cheaper and faster than trying x86 or x86-64. If you look into both the manufacturers, due to monopoly, both of them now give 5-8% increment per year, and if you look back in history, you would find that when more chips were in competition, they used to give 15-20% performance increment per year.

2. While Vishal did share with me what he used and the various ways he uses the mini-pc, I did have a fun speculating on what he could use it. As shared by Romane as his case has shared, the first thing to my mind was backups. Filesystems are notorious in the sense they can be corrupted or can be prone to be corrupted very easily as can be seen above . Backups certainly make a lot of sense, especially rsync. The other thing that came to my mind was having some sort of A.I. and chat server. IIRC, somebody has put quite a bit of open source public domain data in debian servers that could be used to run either a chatbot or an A.I. or both and use that similar to how chatGPT but with much limited scope than what chatgpt uses. I was also thinking a media server which Vishal did share he does. I may probably visit him sometime to see what choices he did and what he learned in the process, if anything. Another thing that could be done is just take a dump of any of commodity markets or any markets and have some sort of predictive A.I. or whatever. A whole bunch of people have scammed thousands of Indian users on this, but if you do it on your own and for your own purposes to aid you buy and sell stocks or whatever commodity you may fancy. After all, nowadays markets themselves are virtual. While Vishal s mini-pc doesn t have any graphics, if it was an AMD APU mini-pc, something like this he could have hosted games in the way of thick server, thin client where all graphics processing happens on the server rather than the client. With virtual reality I think the case for the same case could be made or much more. The only problem with VR/AR is that we don t really have mass-market googles, eye pieces or headset. The only notable project that Google has/had in that place is the Google VR Cardboard headset and the experience is not that great or at least was not that great few years back when I could hear and experience the same. Most of the VR headsets say for example the Meta Quest 2 is for around INR 44k/- while Quest 3 is INR 50k+ and officially not available. As have shared before, the holy grail of VR would be when it falls below INR 10k/- so it becomes just another accessory, not something you really have to save for. There also isn t much content on that but then that is also the whole chicken or egg situation. This again is a non-stop discussion as so much has been happening in that space it needs its own blog post/article whatever. Till later.

21 August 2023

Melissa Wen: AMD Driver-specific Properties for Color Management on Linux (Part 1)

TL;DR: Color is a visual perception. Human eyes can detect a broader range of colors than any devices in the graphics chain. Since each device can generate, capture or reproduce a specific subset of colors and tones, color management controls color conversion and calibration across devices to ensure a more accurate and consistent color representation. We can expose a GPU-accelerated display color management pipeline to support this process and enhance results, and this is what we are doing on Linux to improve color management on `Gamescope/SteamDeck`. Even with the challenges of being external developers, we have been working on mapping `AMD GPU color capabilities` to `the Linux kernel color management interface`, which is a combination of DRM and AMD driver-specific color properties. This more extensive color management pipeline includes `pre-defined Transfer Functions`, `1-Dimensional LookUp Tables (1D LUTs)`, and `3D LUTs` before and after the plane composition/blending.
The study of color is well-established and has been explored for many years. Color science and research findings have also guided technology innovations. As a result, color in Computer Graphics is a very complex topic that I m putting a lot of effort into becoming familiar with. I always find myself rereading all the materials I have collected about color space and operations since I started this journey (about one year ago). I also understand how hard it is to find consensus on some color subjects, as exemplified by all explanations around the 2015 online viral phenomenon of The Black and Blue Dress. Have you heard about it? What is the color of the dress for you? So, taking into account my skills with colors and building consensus, this blog post only focuses on GPU hardware capabilities to support color management :-D If you want to learn more about color concepts and color on Linux, you can find useful links at the end of this blog post.

Linux Kernel, show me the colors ;D DRM color management interface only exposes a small set of post-blending color properties. Proposals to enhance the DRM color API from different vendors have landed the subsystem mailing list over the last few years. On one hand, we got some suggestions to extend DRM post-blending/CRTC color API: DRM CRTC 3D LUT for R-Car (2020 version); DRM CRTC 3D LUT for Intel (draft - 2020); DRM CRTC 3D LUT for AMD by Igalia (v2 - 2023); DRM CRTC 3D LUT for R-Car (v2 - 2023). On the other hand, some proposals to extend DRM pre-blending/plane API: DRM plane colors for Intel (v2 - 2021); DRM plane API for AMD (v3 - 2021); DRM plane 3D LUT for AMD - 2021. Finally, Simon Ser sent the latest proposal in May 2023: Plane color pipeline KMS uAPI, from discussions in the 2023 Display/HDR Hackfest, and it is still under evaluation by the Linux Graphics community. All previous proposals seek a generic solution for expanding the API, but many seem to have stalled due to the uncertainty of matching well the hardware capabilities of all vendors. Meanwhile, the use of AMD color capabilities on Linux remained limited by the DRM interface, as the `DCN 3.0 family color caps and mapping` diagram below shows the Linux/DRM color interface without driver-specific color properties [*]: Bearing in mind that we need to know the variety of color pipelines in the subsystem to be clear about a generic solution, we decided to approach the issue from a different perspective and worked on enabling a set of `Driver-Specific Color Properties for AMD Display Drivers`. As a result, I recently sent another round of the AMD driver-specific color mgmt API. For those who have been following the AMD driver-specific proposal since the beginning (see [RFC][V1]), the main new features of the latest version [v2] are the addition of `pre-blending Color Transformation Matrix (plane CTM)` and the differentiation of `Pre-defined Transfer Functions (TF)` supported by color blocks. For those who just got here, I will recap this work in two blog posts. This one describes the current status of the AMD display driver in the Linux kernel/DRM subsystem and what changes with the driver-specific properties. In the next post, we go deeper to describe the features of each color block and provide a better picture of what is available in terms of color management for Linux.

The Linux kernel color management API and AMD hardware color capabilities Before discussing colors in the Linux kernel with AMD hardware, consider accessing the Linux kernel documentation (version 6.5.0-rc5). In the AMD Display documentation, you will find my previous work documenting AMD hardware color capabilities and the Color Management Properties. It describes how `AMD Display Manager (DM)` intermediates requests between the `AMD Display Core component (DC)` and the `Linux/DRM kernel` interface for color management features. It also describes the relevant function to call the AMD color module in building curves for content space transformations. A subsection also describes hardware color capabilities and how they evolve between versions. This subsection, DC Color Capabilities between DCN generations, is a good starting point to understand what we have been doing on the kernel side to provide a broader color management API with AMD driver-specific properties.

Why do we need more kernel color properties on Linux? Blending is the process of combining multiple planes (framebuffers abstraction) according to their mode settings. Before blending, we can manage the colors of various planes separately; after blending, we have combined those planes in only one output per CRTC. Color conversions after blending would be enough in a single-plane scenario or when dealing with planes in the same color space on the kernel side. Still, it cannot help to handle the blending of multiple planes with different color spaces and luminance levels. With plane color management properties, userspace can get a more accurate representation of colors to deal with the diversity of color profiles of devices in the graphics chain, bring a `wide color gamut (WCG)`, convert `High-Dynamic-Range (HDR)` content to `Standard-Dynamic-Range (SDR)` content (and vice-versa). With a GPU-accelerated display color management pipeline, we can use hardware blocks for color conversions and color mapping and support advanced color management. The current DRM color management API enables us to perform some color conversions after blending, but there is no interface to calibrate input space by planes. Note that here I m not considering some workarounds in the AMD display manager mapping of DRM CRTC de-gamma and DRM CRTC CTM property to pre-blending DC de-gamma and gamut remap block, respectively. So, in more detail, it only exposes three post-blending features:

DRM CRTC de-gamma: used to convert the framebuffer s colors to linear gamma;

DRM CRTC CTM: used for color space conversion;

DRM CRTC gamma: used to convert colors to the gamma space of the connected screen.

AMD driver-specific color management interface We can compare the Linux color management API with and without the driver-specific color properties. From now, we denote driver-specific properties with the AMD prefix and generic properties with the DRM prefix. For visual comparison, I bring the `DCN 3.0 family color caps and mapping` diagram closer and present it here again: Mixing AMD driver-specific color properties with DRM generic color properties, we have a broader Linux color management system with the following features exposed by properties in the plane and CRTC interface, as summarized by this updated diagram: The blocks highlighted by `red lines` are the `new properties` in the driver-specific interface developed by me (Igalia) and Joshua (Valve). The `red dashed lines` are new `links between API and AMD driver components` implemented by us to connect the Linux/DRM interface to AMD hardware blocks, mapping components accordingly. In short, we have the following color management properties exposed by the DRM/AMD display driver:

Pre-blending - AMD Display Pipe and Plane (DPP):

AMD plane de-gamma: 1D LUT and pre-defined transfer functions; used to linearize the input space of a plane;

AMD plane CTM: 3x4 matrix; used to convert plane color space;

AMD plane shaper: 1D LUT and pre-defined transfer functions; used to delinearize and/or normalize colors before applying 3D LUT;

AMD plane 3D LUT: 17x17x17 size with 12 bit-depth; three dimensional lookup table used for advanced color mapping;

AMD plane blend/out gamma: 1D LUT and pre-defined transfer functions; used to linearize back the color space after 3D LUT for blending.

Post-blending - AMD Multiple Pipe/Plane Combined (MPC):

DRM CRTC de-gamma: 1D LUT (can t be set together with plane de-gamma);

DRM CRTC CTM: 3x3 matrix (remapped to post-blending matrix);

DRM CRTC gamma: 1D LUT + AMD CRTC gamma TF; added to take advantage of driver pre-defined transfer functions;

Note: You can find more about AMD display blocks in the Display Core Next (DCN) - Linux kernel documentation, provided by Rodrigo Siqueira (Linux/AMD display developer) in a 2021-documentation series. In the next post, I ll revisit this topic, explaining display and color blocks in detail.

How did we get a large set of color features from AMD display hardware? So, looking at AMD hardware color capabilities in the first diagram, we can see no post-blending (MPC) de-gamma block in any hardware families. We can also see that the AMD display driver maps CRTC/post-blending CTM to pre-blending (DPP) gamut_remap, but there is post-blending (MPC) gamut_remap (DRM CTM) from newer hardware versions that include SteamDeck hardware. You can find more details about hardware versions in the Linux kernel documentation/AMDGPU Product Information. I needed to rework these two mappings mentioned above to provide pre-blending/plane de-gamma and CTM for SteamDeck. I changed the DC mapping to detach `stream gamut remap` matrixes from the `DPP gamut remap` block. That means mapping AMD plane CTM directly to DPP/pre-blending gamut remap block and DRM CRTC CTM to MPC/post-blending gamut remap block. In this sense, I also limited plane CTM properties to those hardware versions with MPC/post-blending gamut_remap capabilities since older versions cannot support this feature without clashes with DRM CRTC CTM. Unfortunately, I couldn t prevent conflict between AMD plane de-gamma and DRM plane de-gamma since post-blending de-gamma isn t available in any AMD hardware versions until now. The fact is that a post-blending de-gamma makes little sense in the AMD color pipeline, where plane blending works better in a linear space, and there are enough color blocks to linearize content before blending. To deal with this conflict, the driver now rejects atomic commits if users try to set both AMD plane de-gamma and DRM CRTC de-gamma simultaneously. Finally, we had no other clashes when enabling other AMD driver-specific color properties for our use case, Gamescope/SteamDeck. Our main work for the remaining properties was understanding the data flow of each property, the hardware capabilities and limitations, and how to shape the data for programming the registers - AMD color block capabilities (and limitations) are the topics of the next blog post. Besides that, we fixed some driver bugs along the way since it was the first Linux use case for most of the new color properties, and some behaviors are only exposed when exercising the engine. Take a look at the Gamescope/Steam Deck Color Pipeline[**], and see how Gamescope uses the new API to manage color space conversions and calibration (please click on the image for a better view): In the next blog post, I ll describe the implementation and technical details of each pre- and post-blending color block/property on the AMD display driver. * Thank Harry Wentland for helping with diagrams, color concepts and AMD capabilities. Thank Joshua Ashton for providing and explaining Gamescope/Steam Deck color pipeline. * Thanks to the Linux Graphics community - explicitly Harry, Joshua, Pekka, Simon, Sebastian, Siqueira, Alex H. and Ville - to all the learning during this Linux DRM/AMD color journey. Also, Carlos and Tomas for organizing the 2023 Display/HDR Hackfest where we have a great and immersive opportunity to discuss Color & HDR on Linux.

Useful Links

Cinematic Color - 2012 SIGGRAPH course notes by Jeremy Selan: an introduction to color science, concepts and pipelines.

Color management and HDR documentation for FOSS graphics by Pekka Paalanen: documentation and useful links on applying color concepts to the Linux graphics stack.

HDR in Linux by Jeremy Cline: a blog post exploring color concepts for HDR support on Linux.

Methods for conversion of high dynamic range content to standard dynamic range content and vice-versa by ITU-R: guideline for conversions between HDR and SDR contents.

Using Lookup Tables to Accelerate Color Transformations by Jeremy Selan: Nvidia blog post about Lookup Tables on color management.

The Importance of Being Linear by Larry Gritz and Eugene d Eon: Nvidia blog post about gamma and color conversions.

4 August 2023

Louis-Philippe V ronneau: pymonitair: Air Quality Monitoring Display with MicroPython

I've never been a fan of IoT devices for obvious reasons: not only do they tend to be excellent at being expensive vendor locked-in machines, but far too often, they also end up turning into e-waste after a short amount of time. Manufacturers can go out of business or simply decide to shut down the cloud servers for older models, and then you're stuck with a brick. Well, this all changes today, as I've built my first IoT device and I love it. Introducing pymonitair. What pymonitair is a MicroPython project that aims to display weather data from a home weather station (like the ones sold by AirGradient) on a small display. The source code was written for the Raspberry Pi Pico W, the Waveshare Pico OLED 1.3 display and the RevolvAir Revo 1 weather station, but can be adapted to other displays and stations easily, as I tried to keep the code as modular as possible. The general MicroPython code itself isn't specific to the Raspberry Pi Pico and shouldn't need to be modified for other boards. pymonitair features:

6 different pages for the supported weather data¹, accessible via the key0 button
Alerting² on the PM2.5 page when the defined threshold has been crossed
Automatic screen blanking after a defined amount of time to save the OLED screen from burn-in
Manual screen blanking using the key1 button

Here's a demo of me scrolling through the different pages and (somewhat failing) to turn the screen on and off:

Why? If you follow my blog, you'll know that my last entry was about building a set of tools to collect and graph data from a weather station my neighbor set up. Why on Earth would I need a separate device to show this data, when the website I've built works perfectly fine and is accessible on any computer or smartphone? Mostly alerts. When the air quality here dropped following forest fires, I found out keeping track of if I had to close my windows and bunker down was quite a hassle. Air quality would degrade during the day and I would only notice it hours later. With the pymonitair, I'll have a little screen flashing angrily at me whenever this happens. A simpler solution would probably have been to forgo hardware altogether and code some icinga2 alert to ping me over Signal whenever the air quality got bad. Hacking on pymonitair was mostly a way to learn to use MicroPython and familiarize myself with this type of embedded hardware device. I'll surely blog about this later this year, but I plan to use a very similar stack to mod my apartment's HVAC unit to stop pulling air from outside when an air quality sensor detects cigarette smoke (or bad air quality in general). Things I've learnt This project was super fun and taught me many things:

MicroPython is close enough to regular Python that I don't mind it. The standard library is much smaller, but there are ways to work around that.
The Raspberry Pi Pico W is a fantastic board. At around 10 CAD (~7 USD), it's much cheaper than anything Arduino offers, much more powerful than those boards and gives you the option of not having to code in some terrible and weird C variant. It also has a nice ecosystem of compatible modules and hats.
Displays, even if they advertise being "batteries included", often are not. I had the bad surprise of discovering the MicroPython driver for the Waveshare Pico OLED 1.3 was pretty bad and lacked a ton of features compared to their C code driver. Thankfully, I was lucky enough that the OLED panel it uses (the SH1107) is common enough for folks to have written their own drivers.
thonny, a Python IDE, is probably the best tool to transfer files to the Raspberry Pico and to debug code, as it provides a file manager and a shell³. I'm a little annoyed by it not being vim, but I got used to its quirks fairly quickly.

PM1, PM2.5, PM10, Temperature, Humidity and Pressure
Part of the screen will flash repeatedly
I did look for other solutions to transfer files to the board, but none of them were actually maintained. I nearly finished packaging ampy before realising it was officially unmaintained and its main alternative, rshell, has had its last release in December 2021. When I caught myself seriously considering writing a script to transfer files over the serial link, I gave up and decided thonny was not that bad after all.

17 July 2023

Ben Hutchings: FOSS activity in June 2023

I uploaded sgt-puzzles to unstable. This brought in the new upstream version previously in experimental. I incorporated an updated German translation from Helge Kreutzmann, and made translation updates less tricky to do.
I made some changes to the nfs-utils package:
- Completed the transition from setting command-line options in /etc/default to /etc/nfs.conf.d.
- Made its shell scripts shellcheck-clean and added shellcheck to CI. (Thanks to who sent a patch for the init scripts.)
These were included in version 1:2.6.3-1.
I accepted several MRs on Salsa:
- linux/master: [armhf] drivers/staging/media/rkvdec: enable rkvdec as module
- linux/master: Update to 6.4-rcX
- linux/bookworm: udeb: add r8188eu to nic-wireless-modules (Closes: #1035824)
- nfs-utils/master: Rely on the generator units for the rpc_pipefs mount
- linux/master: mm: Enable Multi-Gen LRU implementation (by default)
- linux/master: d/rules.real: Also remove executable bit from dtbo files
- linux/master: [mips*]: Fixes for boston kernel
- linux/sid: Ignore ABI changes for xfrm_bpf_md_dst (only for use in xfrm subsystem)
I uploaded linux versions 6.4~rc6-1~exp1 and 6.4~rc7-1~exp1 to experimental.
I updated the buster-security (4.19) branch of linux to stable version 4.19.288, but didn't upload it this month.
I fixed build regressions for linux/experimental on several architectures, and sent the changes upstream where appropriate (hppa, m68k, and preemptively sparc).
I created a bookworm-backports branch for the linux package, but that suite is not yet open to uploads.
I uploaded linux version 6.1.27-1~bpo11+1 and firmware-nonfree version 20230210-5~bpo11+1 to bullseye-backports, but they still haven't been accepted.
I fixed a build regression and many other bugs in dadhi-linux.
I realised that the linux test-patches script still wasn't building all the packages needed to make the linux-headers package (or, on some architectures, the linux-image package) installable. I fixed this for unstable, backported those changes to bookworm, and backported all the test-patches changes to bullseye.
I prepared a backport of firmware-nonfree version 20220913-1 to bullseye (not -backports). This is based on the work Tobias Frost did to update it in buster-security (Debian LTS).
I updated the Debian Kernel Handbook to use the Debian stylesheet. This is now in the live version but I haven't uploaded the package.
I started backporting various kernel security fixes to the affected stable branches. These are not yet tested or submitted upstream.
I fixed a regression in cpu_rmap in the Linux kernel.
I reported a regression in rtl8192 on Linux stable branches.
I co-organised the Debian release party in Leuven. and posted a group photo of this on Mastodon.
The amdgpu driver lists some firmware files as potentially needed that aren't packaged or even publicly available, which leads to warnings from initramfs-tools on systems using this driver. I queried these upstream, which should hopefully lead to a resolution of the bug.
I wrote the arm, mips, and riscv parts of the fix for CVE-2023-3269 a.k.a. StackRot.

13 July 2023

Emmanuel Kasper: Debian 11 to Debian 12 (Bookworm) Upgrade Report

Laptop + Workstation My workstation was initially installed with Debian 8 back in the day, so I might have carried a lot of configuration cruft.
Indeed. I followed the recommended upgrades documentation (apt upgrade --without-new-pkgs followed by apt full-upgrade). And when executing apt full-upgrade I had the following error:

Preparing to unpack .../71-python3-numpy_1%3a1.24.2-1_amd64.deb ...
Unpacking python3-numpy (1:1.24.2-1) over (1:1.19.5-1) ...
dpkg: error processing archive /tmp/apt-dpkg-install-ibI85v/71-python3-numpy_1%3a1.24.2-1_amd64.deb (--unpack):
 trying to overwrite '/usr/bin/f2py', which is also in package python-numpy 1:1.16.5-5

Deleting the python-numpy package and resuming the upgrade with apt --fix-broken install followed by apt full-upgrade allowed the upgrade to complete successfully. This was already metioned in a Debian bug report and would have been avoided if I had purged the locally obsolete packages after upgrading to Debian 11. On laptop and workstation, after the upgrade, for unclear reasons, the gnome3 user extensions were disabled. I reenabled the extensions manually with

gsettings set org.gnome.shell disable-user-extensions false

Finally podman, had a major upgrade from 3 to 4, and a backward-incompatible configuration change. If a custom configuration file was in place in /etc/containers/storage.conf to override the default storage options, you need now to add the following stanza

[storage]
runroot="/run/containers/storage"
graphRoot="/var/lib/containers/storage"

in that file.
Otherwise you ll get the error Failed to obtain podman configuration: runroot must be set when running any podman command. This was discussed upstream. Cloud server (VM) Everything worked flawlessly, nothing to report. Conclusion Again a great Debian release, very happy that I could update three systems with ten thousands of packages with so little fuss. For my small home server running RHEL 8 (with the no cost sub) I will do a reinstall on newer hardware.

11 July 2023

Simon Josefsson: Coping with non-free software in Debian

A personal reflection on how I moved from my Debian home to find two new homes with Trisquel and Guix for my own ethical computing, and while doing so settled my dilemma about further Debian contributions. Debian s contributions to the free software community has been tremendous. Debian was one of the early distributions in the 1990 s that combined the GNU tools (compiler, linker, shell, editor, and a set of Unix tools) with the Linux kernel and published a free software operating system. Back then there were little guidance on how to publish free software binaries, let alone entire operating systems. There was a lack of established community processes and conflict resolution mechanisms, and lack of guiding principles to motivate the work. The community building efforts that came about in parallel with the technical work has resulted in a steady flow of releases over the years. From the work of Richard Stallman and the Free Software Foundation (FSF) during the 1980 s and early 1990 s, there was at the time already an established definition of free software. Inspired by free software definition, and a belief that a social contract helps to build a community and resolve conflicts, Debian s social contract (DSC) with the free software community was published in 1997. The DSC included the Debian Free Software Guidelines (DFSG), which directly led to the Open Source Definition.

I was introduced to GNU/Linux through Slackware in the early 1990 s (oh boy those nights calculating XFree86 modeline s and debugging sendmail.cf) and primarily used RedHat Linux during ca 1995-2003. I switched to Debian during the Woody release cycles, when the original RedHat Linux was abandoned and Fedora launched. It was Debian s explicit community processes and infrastructure that attracted me. The slow nature of community processes also kept me using RedHat for so long: centralized and dogmatic decision processes often produce quick and effective outcomes, and in my opinion RedHat Linux was technically better than Debian ca 1995-2003. However the RedHat model was not sustainable, and resulted in the RedHat vs Fedora split. Debian catched up, and reached technical stability once its community processes had been grounded. I started participating in the Debian community around late 2006. My interpretation of Debian s social contract is that Debian should be a distribution of works licensed 100% under a free license. The Debian community has always been inclusive towards non-free software, creating the contrib/non-free section and permitting use of the bug tracker to help resolve issues with non-free works. This is all explained in the social contract. There has always been a clear boundary between free and non-free work, and there has been a commitment that the Debian system itself would be 100% free. The concern that RedHat Linux was not 100% free software was not critical to me at the time: I primarily (and happily) ran GNU tools on Solaris, IRIX, AIX, OS/2, Windows etc. Running GNU tools on RedHat Linux was an improvement, and I hadn t realized it was possible to get rid of all non-free software on my own primary machine. Debian realized that goal for me. I ve been a believer in that model ever since. I can use Solaris, macOS, Android etc knowing that I have the option of using a 100% free Debian. While the inclusive approach towards non-free software invite and deserve criticism (some argue that being inclusive to non-inclusive behavior is a bad idea), I believe that Debian s approach was a successful survival technique: by being inclusive to and a compromise between free and non-free communities, Debian has been able to stay relevant and contribute to both environments. If Debian had not served and contributed to the free community, I believe free software people would have stopped contributing. If Debian had rejected non-free works completely, I don t think the successful Ubuntu distribution would have been based on Debian. I wrote the majority of the text above back in September 2022, intending to post it as a way to argue for my proposal to maintain the status quo within Debian. I didn t post it because I felt I was saying the obvious, and that the obvious do not need to be repeated, and the rest of the post was just me going down memory lane. The Debian project has been a sustainable producer of a 100% free OS up until Debian 11 bullseye. In the resolution on non-free firmware the community decided to leave the model that had resulted in a 100% free Debian for so long. The goal of Debian is no longer to publish a 100% free operating system, instead this was added: The Debian official media may include firmware . Indeed the Debian 12 bookworm release has confirmed that this would not only be an optional possibility. The Debian community could have published a 100% free Debian, in parallel with the non-free Debian, and still be consistent with their newly adopted policy, but chose not to. The result is that Debian s policies are not consistent with their actions. It doesn t make sense to claim that Debian is 100% free when the Debian installer contains non-free software. Actions speaks louder than words, so I m left reading the policies as well-intended prose that is no longer used for guidance, but for the peace of mind for people living in ivory towers. And to attract funding, I suppose. So how to deal with this, on a personal level? I did not have an answer to that back in October 2022 after the vote. It wasn t clear to me that I would ever want to contribute to Debian under the new social contract that promoted non-free software. I went on vacation from any Debian work. Meanwhile Debian 12 bookworm was released, confirming my fears. I kept coming back to this text, and my only take-away was that it would be unethical for me to use Debian on my machines. Letting actions speak for themselves, I switched to PureOS on my main laptop during October, barely noticing any difference since it is based on Debian 11 bullseye. Back in December, I bought a new laptop and tried Trisquel and Guix on it, as they promise a migration path towards ppc64el that PureOS do not. While I pondered how to approach my modest Debian contributions, I set out to learn Trisquel and gained trust in it. I migrated one Debian machine after another to Trisquel, and started to use Guix on others. Migration was easy because Trisquel is based on Ubuntu which is based on Debian. Using Guix has its challenges, but I enjoy its coherant documented environment. All of my essential self-hosted servers (VM hosts, DNS, e-mail, WWW, Nextcloud, CI/CD builders, backup etc) uses Trisquel or Guix now. I ve migrated many GitLab CI/CD rules to use Trisquel instead of Debian, to have a more ethical computing base for software development and deployment. I wish there were official Guix docker images around. Time has passed, and when I now think about any Debian contributions, I m a little less muddled by my disappointment of the exclusion of a 100% free Debian. I realize that today I can use Debian in the same way that I use macOS, Android, RHEL or Ubuntu. And what prevents me from contributing to free software on those platforms? So I will make the occasional Debian contribution again, knowing that it will also indirectly improve Trisquel. To avoid having to install Debian, I need a development environment in Trisquel that allows me to build Debian packages. I have found a recipe for doing this:

# System commands:
sudo apt-get install debhelper git-buildpackage debian-archive-keyring
sudo wget -O /usr/share/debootstrap/scripts/debian-common https://sources.debian.org/data/main/d/debootstrap/1.0.128%2Bnmu2/scripts/debian-common
sudo wget -O /usr/share/debootstrap/scripts/sid https://sources.debian.org/data/main/d/debootstrap/1.0.128%2Bnmu2/scripts/sid
# Run once to create build image:
DIST=sid git-pbuilder create --mirror http://deb.debian.org/debian/ --debootstrapopts "--exclude=usr-is-merged" --basepath /var/cache/pbuilder/base-sid.cow
# Run in a directory with debian/ to build a package:
gbp buildpackage --git-pbuilder --git-dist=sid

How to sustainably deliver a 100% free software binary distributions seems like an open question, and the challenges are not all that different compared to the 1990 s or early 2000 s. I m hoping Debian will come back to provide a 100% free platform, but my fear is that Debian will compromise even further on the free software ideals rather than the opposite. With similar arguments that were used to add the non-free firmware, Debian could compromise the free software spirit of the Linux boot process (e.g., non-free boot images signed by Debian) and media handling (e.g., web browsers and DRM), as Debian have already done with appstore-like functionality for non-free software (Python pip). To learn about other freedom issues in Debian packaging, browsing Trisquel s helper scripts may enlight you. Debian s setback and the recent setback for RHEL-derived distributions are sad, and it will be a challenge for these communities to find internally consistent coherency going forward. I wish them the best of luck, as Debian and RHEL are important for the wider free software eco-system. Let s see how the community around Trisquel, Guix and the other FSDG-distributions evolve in the future. The situation for free software today appears better than it was years ago regardless of Debian and RHEL s setbacks though, which is important to remember! I don t recall being able install a 100% free OS on a modern laptop and modern server as easily as I am able to do today. Happy Hacking! Addendum 22 July 2023: The original title of this post was Coping with non-free Debian, and there was a thread about it that included feedback on the title. I do agree that my initial title was confrontational, and I ve changed it to the more specific Coping with non-free software in Debian. I do appreciate all the fine free software that goes into Debian, and hope that this will continue and improve, although I have doubts given the opinions expressed by the majority of developers. For the philosophically inclined, it is interesting to think about what it means to say that a compilation of software is freely licensed. At what point does a compilation of software deserve the labels free vs non-free? Windows probably contains some software that is published as free software, let s say Windows is 1% free. Apple authors a lot of free software (as a tangent, Apple probably produce more free software than what Debian as an organization produces), and let s say macOS contains 20% free software. Solaris (or some still maintained derivative like OpenIndiana) is mostly freely licensed these days, isn t it? Let s say it is 80% free. Ubuntu and RHEL pushes that closer to let s say 95% free software. Debian used to be 100% but is now slightly less at maybe 99%. Trisquel and Guix are at 100%. At what point is it reasonable to call a compilation free? Does Debian deserve to be called freely licensed? Does macOS? Is it even possible to use these labels for compilations in any meaningful way? All numbers just taken from thin air. It isn t even clear how this can be measured (binary bytes? lines of code? CPU cycles? etc). The caveat about license review mistakes applies. I ignore Debian s own claims that Debian is 100% free software, which I believe is inconsistent and no longer true under any reasonable objective analysis. It was not true before the firmware vote since Debian ships with non-free blobs in the Linux kernel for example.

9 July 2023

Russell Coker: Matrix

Introduction In 2020 I first setup a Matrix [1] server. Matrix is a full featured instant messaging protocol which requires a less stringent definition of instant , messages being delayed for minutes aren t that uncommon in my experience. Matrix is a federated service where the servers all store copies of the room data, so when you connect your client to it s home server it gets all the messages that were published while you were offline, it is widely regarded as being IRC but without a need to be connected all the time. One of it s noteworthy features is support for end to end encryption (so the server can t access cleartext messages from users) as a core feature. Matrix was designed for bridging with other protocols, the most well known of which is IRC. The most common Matrix server software is Synapse which is written in Python and uses a PostgreSQL database as it s backend [2]. My tests have shown that a lightly loaded Synapse server with less than a dozen users and only one or two active users will have noticeable performance problems if the PostgreSQL database is stored on SATA hard drives. This seems like the type of software that wouldn t have been developed before SSDs became commonly affordable. The matrix-synapse is in Debian/Unstable and the backports repositories for Bullseye and Buster. As Matrix is still being very actively developed you want to have a recent version of all related software so Debian/Buster isn t a good platform for running it, Bullseye or Bookworm are the preferred platforms. Configuring Synapse isn t really hard, but there are some postential problems. The first thing to do is to choose the DNS name, you can never change it without dropping the database (fresh install of all software and no documented way of keeping user configuration) so you don t want to get it wrong. Generally you will want the Matrix addresses at the top level of the domain you choose. When setting up a Matrix server for my local LUG I chose the top level of their domain luv.asn.au as the DNS name for the server. If you don t want to run a server then there are many open servers offering free account. Server Configuration Part of doing this configuration required creating the URL https://luv.asn.au/.well-known/matrix/client with the following contents so clients know where to connect. Note that you should not setup Jitsi sections without first discussing it with the people who run the Jitsi server in question.

 
  "m.homeserver":  
    "base_url": "https://luv.asn.au"
   
  "jitsi":  
    "preferredDomain": "jitsi.perthchat.org"
   
  "im.vector.riot.jitsi":  
    "preferredDomain": "jitsi.perthchat.org"

Also the URL https://luv.asn.au/.well-known/matrix/server for other servers to know where to connect:

 
  "m.server": "luv.asn.au:8448"

If the base_url or the m.server points to a name that isn t configured then you need to add it to the web server configuration. See section 3.1 of the documentation about well known Matrix client fields [3]. The SE Linux specific parts of the configuration are to run the following commands as Bookworm and Bullseye SE Linux policy have support for Synapse:

setsebool -P httpd_setrlimit 1
setsebool -P httpd_can_network_relay 1
setsebool -P matrix_postgresql_connect 1

To configure apache you have to enable proxy mode and SSL with the command a2enmod proxy ssl proxy_http and add the line Listen 8443 to /etc/apache2/ports.conf and restart Apache. The command chmod 700 /etc/matrix-synapse should probably be run to improve security, there s no reason for less restrictive permissions on that directory. In the /etc/matrix-synapse/homeserver.yaml file the macaroon_secret_key is a random key for generating tokens. To use the matrix.org server as a trusted key server and not receive warnings put the following line in the config file:

suppress_key_server_warning: true

A line like the following is needed to configure the baseurl:

public_baseurl: https://luv.asn.au:8448/

To have Synapse directly accept port 8448 connections you have to change bind_addresses in the first section of listeners to the global listen IPv6 and IPv4 addresses. The registration_shared_secret is a password for adding users. When you have set that you can write a shell script to add new users such as:

#!/bin/bash
# usage: matrix_new_user USER PASS
synapse_register_new_matrix_user -u $1 -p $2 -a -k THEPASSWORD

You need to set tls_certificate_path and tls_private_key_path to appropriate values, usually something like the following:

tls_certificate_path: "/etc/letsencrypt/live/www.luv.asn.au-0001/fullchain.pem"
tls_private_key_path: "/etc/letsencrypt/live/www.luv.asn.au-0001/privkey.pem"

For the database section you need something like the following which matches your PostgreSQL setup:

  name: "psycopg2"
  args:
    user: WWWWWW
    password: XXXXXXX
    database: YYYYYYY
    host: ZZZZZZ
    cp_min: 5
    cp_max: 10

You need to run psql commands like the following to set it up:

create role WWWWWW login password 'XXXXXXX';
create database YYYYYYY with owner WWWWWW ENCODING 'UTF8' LOCALE 'C' TEMPLATE 'template0';

For the Apache configuration you need something like the following for the port 8448 web server:

<VirtualHost *:8448>
  SSLEngine on
...
  ServerName luv.asn.au;
  AllowEncodedSlashes NoDecode
  ProxyPass /_matrix http://127.0.0.1:8008/_matrix nocanon
  ProxyPassReverse /_matrix http://127.0.0.1:8008/_matrix
  AllowEncodedSlashes NoDecode
  ProxyPass /_matrix http://127.0.0.1:8008/_matrix nocanon
  ProxyPassReverse /_matrix http://127.0.0.1:8008/_matrix
</VirtualHost>

Also you must add the ProxyPass section to the port 443 configuration (the server that is probably doing other more directly user visible things) for most (all?) end-user clients:

  ProxyPass /_matrix http://127.0.0.1:8008/_matrix nocanon

This web page can be used to test listing rooms via federation without logging in [4]. If it gives the error Can t find this server or its room list then you must set allow_public_rooms_without_auth and allow_public_rooms_over_federation to true in /etc/matrix-synapse/homeserver.yaml. The Matrix Federation Tester site [5] is good for testing new servers and for tests after network changes. Clients The Element (formerly known as Riot) client is the most common [6]. The following APT repository will allow you to install Element via apt install element-desktop on Debian/Buster.

deb https://packages.riot.im/debian/ default main

The Debian backports repository for Buster has the latest version of Quaternion, apt install quaternion should install that for you. Quaternion doesn t support end to end encryption (E2EE) and also doesn t seem to have good support for some other features like being invited to a room. My current favourite client is Schildi Chat on Android [7], which has a notification message 24*7 to reduce the incidence of Android killing it. Eventually I want to go to PinePhone or Librem 5 for all my phone use so I need to find a full featured Linux client that works on a small screen. Comparing to Jabber I plan to keep using Jabber for alerts because it really does instant messaging, it can reliably get the message to me within a matter of seconds. Also there are a selection of command-line clients for Jabber to allow sending messages from servers. When I first investigated Matrix there was no program suitable for sending messages from a script and the libraries for the protocol made it unreasonably difficult to write one. Now there is a Matrix client written in shell script [8] which might do that. But the delay in receiving messages is still a problem. Also the Matrix clients I ve tried so far have UIs that are more suited to serious chat than to quickly reading a notification message. Bridges Here is a list of bridges between Matrix and other protocols [9]. You can run bridges yourself for many different messaging protocols including Slack, Discord, and Messenger. There are also bridges run for public use for most IRC channels. Here is a list of integrations with other services [10], this is for interacting with things other than IM systems such as RSS feeds, polls, and other things. This also has some frameworks for writing bots. More Information The Debian wiki page about Matrix is good [11]. The view.matrix.org site allows searching for public rooms [12].

Vasudev Kamath: Using LUKS-Encrypted USB Stick with TPM2 Integration

I use a LUKS-encrypted USB stick to store my GPG and SSH keys, which acts as a backup and portable key setup when working on different laptops. One inconvenience with LUKS-encrypted USB sticks is that you need to enter the password every time you want to mount the device, either through a Window Manager like KDE or using the cryptsetup luksOpen command. Fortunately, many laptops nowadays come equipped with TPM2 modules, which can be utilized to automatically decrypt the device and subsequently mount it. In this post, we'll explore the usage of systemd-cryptenroll for this purpose, along with udev rules and a set of scripts to automate the mounting of the encrypted USB. First, ensure that your device has a TPM2 module. You can run the following command to check:

sudo journalctl -k --grep=tpm2

The output should resemble the following:

Jul 08 18:57:32 bhairava kernel: ACPI: SSDT 0x00000000BBEFC000 0003C6 (v02
LENOVO Tpm2Tabl 00001000 INTL 20160422) Jul 08 18:57:32 bhairava kernel:
ACPI: TPM2 0x00000000BBEFB000 000034 (v03 LENOVO TP-R0D 00000830
PTEC 00000002) Jul 08 18:57:32 bhairava kernel: ACPI: Reserving TPM2 table
memory at [mem 0xbbefb000-0xbbefb033]

You can also use the systemd-cryptenroll command to check for the availability of a TPM2 device on your laptop:

systemd-cryptenroll --tpm2-device=list

The output will be something like following:

blog git:(master) systemd-cryptenroll --tpm2-device=list
PATH        DEVICE      DRIVER
/dev/tpmrm0 MSFT0101:00 tpm_tis
   blog git:(master)

Next, ensure that you have connected your encrypted USB device. Note that systemd-cryptenroll only works with LUKS2 and not LUKS1. If your device is LUKS1-encrypted, you may encounter an error while enrolling the device, complaining about the LUKS2 superblock not found. To determine if your device uses a LUKS1 header or LUKS2, use the cryptsetup luksDump <device> command. If it is LUKS1, the header will begin with:

LUKS header information for /dev/sdb1
Version:        1
Cipher name:    aes
Cipher mode:    xts-plain64
Hash spec:      sha256
Payload offset: 4096

Converting from LUKS1 to LUKS2 is a simple process, but for safety, ensure that you backup the header using the cryptsetup luksHeaderBackup command. Once backed up, use the following command to convert the header to LUKS2:

sudo cryptsetup convert --type luks2 /dev/sdb1

After conversion, the header will look like this:

Version:        2
Epoch:          4
Metadata area:  16384 [bytes]
Keyslots area:  2064384 [bytes]
UUID:           000b2670-be4a-41b4-98eb-9adbd12a7616
Label:          (no label)
Subsystem:      (no subsystem)
Flags:          (no flags)

The next step is to enroll the new LUKS key for the encrypted device using systemd-cryptenroll. Run the following command:

sudo systemd-cryptenroll --tpm2-device=/dev/tpmrm0 --tpm2-pcrs="0+7" /dev/sdb1

This command will prompt you to provide the existing key to unseal the device. It will then add a new random key to the volume, allowing it to be unlocked in addition to the existing keys. Additionally, it will bind this new key to PCRs 0 and 7, representing the system firmware and Secure Boot state. If there is only one TPM device on the system, you can use --tpm2-device=auto to automatically select the device. To confirm that the new key has been enrolled, you can dump the LUKS configuration and look for a systemd-tpm2 token entry, as well as an additional entry in the Keyslots section. To test the setup, you can use the /usr/lib/systemd/systemd-cryptsetup command. Additionally, you can check if the device is unsealed by using lsblk:

sudo /usr/lib/systemd/systemd-cryptsetup attach GPG_USB "/dev/sdb1" - tpm2-device=auto
lsblk

The lsblk command should display the unsealed and mounted device, like this:

NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
sda           8:0    0 223.6G  0 disk
 sda1        8:1    0   976M  0 part  /boot/efi
 sda2        8:2    0 222.6G  0 part
   root    254:0    0 222.6G  0 crypt /
sdb           8:16   1   7.5G  0 disk
 sdb1        8:17   1   7.5G  0 part
   GPG_USB 254:1    0   7.5G  0 crypt /media/vasudev/GPG_USB

Auto Mounting the device Now that we have solved the initial problem of unsealing the USB device using TPM2 instead of manually entering the key, the next step is to automatically mount the device upon insertion and remove the mapping when the device is removed. This can be achieved using the following udev rules:

ACTION=="add", KERNEL=="sd*", ENV DEVTYPE =="partition", ENV ID_BUS =="usb", ENV SYSTEMD_WANTS ="mount-gpg-usb@$env DEVNAME .service"
ACTION=="remove", KERNEL=="sd*", ENV DEVTYPE =="partition", ENV ID_BUS =="usb", RUN+="/usr/local/bin/umount_enc_usb.sh '%E ID_FS_UUID '"

When a device is added, a systemd service is triggered to mount the device at a specific location. Initially, I used a script with the RUN directive, but it resulted in an exit code of 32. This might be due to systemd-cryptsetup taking some time to return, causing udev to time out. To address this, I opted to use a systemd service instead. For device removal, even though the physical device is no longer present, the mapping may still remain, causing issues upon reinsertion. To resolve this, I created a script to close the luks mapping upon device removal. Below are the systemd service and script files: mount_enc_usb.sh:

#!/bin/bash
set -x
if [[ "$#" -ne 1 ]]; then
    echo "$(basename $0) <device>"
    exit 1
fi
device_uuid="$(blkid --output udev $1   grep ID_FS_UUID=   cut -d= -f2)"
if [[ "$device_uuid" == 000b2670-be4a-41b4-98eb-9adbd12a7616 ]]; then
    # Found our device, let's trigger systemd-cryptsetup
    /usr/lib/systemd/systemd-cryptsetup attach GPG_USB "$1" - tpm2-device=auto
    [[ -d /media/vasudev/GPG_USB ]]   (mkdir -p /media/vasudev/GPG_USB/ && chown vasudev:vasudev /media/vasudev/GPG_USB)
    mount /dev/mapper/GPG_USB /media/vasudev/GPG_USB
else
    echo "Not the interested device. Ignoring."
    exit 0
fi

umount_enc_usb.sh:

#!/bin/bash
if [[ "$#" -ne 1 ]]; then
  echo "$(basename $0) <fsuuid>"
  exit 1
fi
if [[ "$1" == "000b2670-be4a-41b4-98eb-9adbd12a7616" ]]; then
  # Our device is removed, let's close the luks mapping
  [[ -e /dev/mapper/GPG_USB ]] && cryptsetup luksClose /dev/mapper/GPG_USB
else
  echo "Not our device."
  exit 0
fi

mount-gpg-usb@.service:

[Unit]
Description=Mount the encrypted USB device service
[Service]
Type=simple
ExecStart=/usr/local/bin/mount_enc_usb.sh

With this setup, plugging in the USB device will automatically unseal and mount it, and upon removal, the luks mapping will be closed.

Note

This can be even done for LUKS2 encrypted root disk but will need some tweaking in initramfs.

References

8 July 2023

Russell Coker: Sandboxing Phone Apps

As a follow up to Wayland [1]: A difficult problem with Linux desktop systems (which includes phones and tablets) is restricting application access so that applications can t mess with each other s data or configuration but also allowing them to share data as needed. This has been mostly solved for Android but that involved giving up all legacy Linux apps. I think that we need to get phones capable of running a full desktop environment and having Android level security on phone apps and regular desktop apps. My interest in this is phones running Debian and derivatives such as PureOS. But everything I describe in this post should work equally well for all full featured Linux distributions for phones such as Arch, Gentoo, etc and phone based derivatives of those such as Manjaro. It may be slightly less applicable to distributions such as Alpine Linux and it s phone derivative PostmarketOS, I would appreciate comments from contributors to PostmarketOS or Alpine Linux about this. I ve investigated some of the ways of solving these problems. Many of the ways of doing this involves namespaces. The LWN articles about namespaces are a good background to some of these technologies [2]. The LCA keynote lecture Containers aka crazy user space fun by Jess Frazelle has a good summary of some of the technology [3]. One part that I found particularly interesting was the bit about recognising the container access needed at compile time. This can also be part of recognising bad application design at compile time, it s quite common for security systems to flag bad security design in programs. Firejail To sandbox applications you need to have some method of restricting what they do, this means some combination of namespaces and similar features. Here s an article on sandboxing with firejail [4]. Firejail uses namespaces, seccomp-bpf, and capabilities to restrict programs. It allows bind mounts if run as root and if not run as root it can restrict file access by name and access to networking, system calls, and other features. It has a convenient learning mode that generates policy for you, so if you have a certain restricted set of tasks that an application is to perform you can run it once and then force it to do only the same operations in future. I recommend that everyone who is still reading at this point try out firejail. Here s an example of what you can do:

# create a profile
firejail --build=xterm.profile xterm
# now this run can only do what the previous run did
firejail --profile=xterm.profile xterm

Note that firejail is SETUID root so can potentially reduce system security and it has had security issues in the past. In spite of that it can be good for allowing a trusted user to run programs with less access to the system. Also it is a good way to start learning about such things. I don t think it s a good solution for what I want to do. But I won t rule out the possibility of using it at some future time for special situations. Bubblewrap I tried out firejail with the browser Epiphany (Debian package epiphany-browser) on my Librem5, but that didn t work as Epiphany uses /usr/bin/bwrap (bubblewrap) for it s internal sandboxing (here is an informative blog post about the history of bubblewrap AKA xdg-app-helper which was developed as part of flatpak [5]). The Epiphany bubblewrap sandbox is similar to the situation with Chrome/Chromium which have internal sandboxing that s incompatible with firejail. The firejail man page notes that it s not compatible with Snap, Flatpack, and similar technologies. One issue this raises is that we can t have a namespace based sandboxing system applied to all desktop apps with no extra configuration as some desktop apps won t work with it. Bubblewrap requires setting kernel.unprivileged_userns_clone=1 to run as non-root (IE provide the normal and expected functionality) which potentially reduces system security. Here is an example of a past kernel bug that was exploitable by creating a user namespace with CAP_SYS_ADMIN [6]. But it s the default in recent Debian kernels which means that the issues have been reviewed and determined to be a reasonable trade-off and also means that many programs will use the feature and break if it s disabled. Here is an example of how to use Bubblewrap on Debian, after installing the bubblewrap run the following command. Note that the new-session option (to prevent injecting characters in the keyboard buffer with TIOCSTI) makes the session mostly unusable for a shell.

bwrap --ro-bind /usr /usr --symlink usr/lib64 /lib64 --symlink usr/lib /lib --proc /proc --dev /dev --unshare-pid --die-with-parent bash

Here is an example of using Bubblewrap to sandbox the game Warzone2100 running with Wayland/Vulkan graphics and Pulseaudio sound.

bwrap --bind $HOME/.local/share/warzone2100 $HOME/.local/share/warzone2100 --bind /run/user/$UID/pulse /run/user/$UID/pulse --bind /run/user/$UID/wayland-0 /run/user/$UID/wayland-0 --bind /run/user/$UID/wayland-0.lock /run/user/$UID/wayland-0.lock --ro-bind /usr /usr --symlink usr/bin /bin --symlink usr/lib64 /lib64 --symlink usr/lib /lib --proc /proc --dev /dev --unshare-pid --dev-bind /dev/dri /dev/dri --ro-bind $HOME/.pulse $HOME/.pulse --ro-bind $XAUTHORITY $XAUTHORITY --ro-bind /sys /sys --new-session --die-with-parent warzone2100

Here is an example of using Bubblewrap to sandbox the Debian bug reporting tool reportbug

bwrap --bind /tmp /tmp --ro-bind /etc /etc --ro-bind /usr /usr --ro-bind /var/lib/dpkg /var/lib/dpkg --symlink usr/sbin /sbin --symlink usr/bin /bin --symlink usr/lib64 /lib64 --symlink usr/lib /lib --symlink /usr/lib32 /lib32 --symlink /usr/libx32 /libx32 --proc /proc --dev /dev --die-with-parent --unshare-ipc --unshare-pid reportbug

Here is an example shell script to wrap the build process for Debian packages. This needs to run with unshare-user and specifying the UID as 0 because fakeroot doesn t work in the container, I haven t worked out why but doing it through the container is a better method anyway. This script shares read-write the parent of the current directory as the Debian build process creates packages and metadata files in the parent directory. This will prevent the automatic signing scripts which is a feature not a bug, so after building packages you have to sign the .changes file with debsign. One thing I just learned is that the Debian build system Sbuild can use chroots for building packages for similar benefits [7]. Some people believe that sbuild is the correct way of doing it regardless of the chroot issue. I think it s too heavy-weight for most of my Debian package building, but even if I had been convinced I d still share the information about how to use bwrap as Debian is about giving users choice.

#!/bin/bash
set -e
BUILDDIR=$(realpath $(pwd)/..)
exec bwrap --bind /tmp /tmp --bind $BUILDDIR $BUILDDIR --ro-bind /etc /etc --ro-bind /usr /usr --ro-bind /var/lib/dpkg /var/lib/dpkg --symlink usr/bin /bin --symlink usr/lib64 /lib64 --symlink usr/lib /lib --proc /proc --dev /dev --die-with-parent --unshare-user --unshare-ipc --unshare-net --unshare-pid --new-session --uid 0 --gid 0 $@

Here is an informative blog post about using Bubblewrap with Seccomp (BPF) [8]. In a future post I ll write about how to get this sort of thing going but I ll just leave the URL here for people who want to do it on their own. The source for the flatpak-run program is the only example I could find of using Seccomp with Bubblewrap [9]. A lot of that code is worth copying for application sandboxing, maybe the entire program. Unshare The unshare command from the util-linux package has a large portion of the Bubblewrap functionality. The things that it doesn t do like creating a new session can be done by other utilities. Here is an example of creating a container with unshare and then using cgroups with it [10]. systemd --user Recent distributions have systemd support for running a user session, the Arch Linux Wiki has a good description of how this works [11]. The units for a user are .service files stored in /usr/lib/systemd/user/ (distribution provided), ~/.local/share/systemd/user/ (user installed applications in debian a link to ~/.config/systemd/user/), ~/.config/systemd/user/ (for manual user config), and /etc/systemd/user/ (local sysadmin provided) Here are some example commands for manipulating this:

# show units running for the current user
systemctl --user
# show status of one unit
systemctl --user status kmail.service
# add an environment variable to the list for all user units
systemctl --user import-environment XAUTHORITY
# start a user unit
systemctl --user start kmail.service
# analyse security for all units for the current user
systemd-analyze --user security
# analyse security for one unit
systemd-analyze --user security kmail.service

Here is a test kmail.service file I wrote to see what could be done for kmail, I don t think that kmail is the app most needing to be restricted it is in more need of being protected from other apps but it still makes a good test case. This service file took it from the default risk score of 9.8 (UNSAFE) to 6.3 (MEDIUM) even though I was getting the error code=exited, status=218/CAPABILITIES when I tried anything that used capabilities (apparently due to systemd having some issue talking to the kernel).

[Unit]
Description=kmail
[Service]
ExecStart=/usr/bin/kmail
# can not limit capabilities (code=exited, status=218/CAPABILITIES)
#CapabilityBoundingSet=~CAP_SYS_TIME CAP_SYS_PACCT CAP_KILL CAP_WAKE_ALARM CAP_DAC_OVERRIDE CAP_DAC_READ_SEARCH CAP_FOWNER CAP_IPC_OWNER CAP_LINUX_IMMUTABLE CAP_IPC_LOCK CAP_SYS_MODULE CAP_SYS_TTY_CONFIG CAP_SYS_BOOT CAP_SYS_CHROOT CAP_BLOCK_SUSPEND CAP_LEASE CAP_MKNOD CAP_CHOWN CAP_FSETID CAP_SETFCAP CAP_SETGID CAP_SETUID CAP_SETPCAP CAP_SYS_RAWIO CAP_SYS_PTRACE CAP_SYS_NICE CAP_SYS_RESOURCE CAP_NET_ADMIN CAP_NET_BIND_SERVICE CAP_NET_BROADCAST CAP_NET_RAW CAP_SYS_ADMIN CAP_SYSLOG
# also 218 for ProtectKernelModules PrivateDevices ProtectKernelLogs ProtectClock
# MemoryDenyWriteExecute stops it displaying message content (bad)
# needs @resources and @mount to startup
# needs @privileged to display message content
SystemCallFilter=~@cpu-emulation @debug @raw-io @reboot @swap @obsolete
SystemCallArchitectures=native
UMask=077
NoNewPrivileges=true
ProtectControlGroups=true
PrivateMounts=false
RestrictNamespaces=~user pid net uts mnt cgroup ipc
RestrictSUIDSGID=true
ProtectHostname=true
LockPersonality=true
ProtectKernelTunables=true
RestrictAddressFamilies=~AF_PACKET
RestrictRealtime=true
ProtectSystem=strict
ProtectProc=invisible
PrivateUsers=true
[Install]

When I tried to use the TemporaryFileSystem=%h directive (to make the home directory a tmpfs the most basic step in restricting what a regular user application can do) I got the error (code=exited, status=226/NAMESPACE) . So I don t think the systemd user setup competes with bubblewrap for restricting user processes. But if anyone else can start where I left off and go further then that will be interesting. Systemd-run The following shell script runs firefox as a dynamic user via systemd-run, running this asks for the root password and any mechanism for allowing that sort of thing opens potential security holes. So at this time while it s an interesting feature I don t think it is suitable for running regular applications on a phone or Linux desktop.

#!/bin/bash
# systemd-run Firefox with DynamicUser and home directory.
#
# Run as a non-root user.
# Or, run as root and change $USER below.
SANDBOX_MINIMAL=(
    --property=DynamicUser=1
    --property=StateDirectory=openstreetmap
    # --property=RootDirectory=/debian_sid
)
SANDBOX_X11=(
    # Sharing Xorg always defeats security, regardless of any sandboxing tech,
    # but the config is almost ready for Wayland, and there's Xephyr.
#    --property=LoadCredential=.ICEauthority:/home/$USER/.ICEauthority
    --property=LoadCredential=.Xauthority:/home/$USER/.Xauthority
    --property=Environment=DISPLAY=:0
)
SANDBOX_FIREFOX=(
    # hardware-accelerated rendering
    --property=BindPaths=/dev/dri
    # webcam
    # --property=SupplementaryGroups=video
)
systemd-run  \
    "$ SANDBOX_MINIMAL[@] "  "$ SANDBOX_X11[@] " "$ SANDBOX_FIREFOX[@] " \
    bash -c '
        export XAUTHORITY="$CREDENTIALS_DIRECTORY"/.Xauthority
        export ICEAUTHORITY="$CREDENTIALS_DIRECTORY"/.ICEauthority
        export HOME="$STATE_DIRECTORY"/home
        firefox --no-remote about:blank
    '

Qubes OS Here is an interesting demo video of QubesOS [12] which shows how it uses multiple VMs to separate different uses. Here is an informative LCA presentation about Qubes which shows how it asks the user about communication between VMs and access to hardware [13]. I recommend that everyone who hasn t seen Qubes in operation watch the first video and everyone who isn t familiar with the computer science behind it watch the second video. Qubes appears to be a free software equivalent to NetTop as far as I can tell without ever being able to use NetTop. I don t think Qubes is a good match for my needs in general use and it definitely isn t a good option for phones (VMs use excessive CPU on phones). But it s methods for controlling access have some ideas that are worth copying. File Access XDG Desktop Portal One core issue for running sandboxed applications is to allow them to access files permitted by the user but no other files. There are two main parts to this problem, the easier one is to have each application have it s own private configuration directory which can be addressed by bind mounts, MAC systems, running each application under a different UID or GID, and many other ways. The hard part of file access is to allow the application to access random files that the user wishes. For example I want my email program, IM program, and web browser to be able to save files and also to be able to send arbitrary files via email, IM, and upload to web sites. But I don t want one of those programs to be able to access all the files from the others if it s compromised. So only giving programs access to arbitrary files when the user chooses such a file makes sense. There is a package xdg-desktop-portal which provides a dbus interface for opening files etc for a sandboxed application [14]. This portal has backends for KDE, GNOME, and Wayland among others which allow the user to choose which file or files the application may access. Chrome/Chromium is one well known program that uses the xdg-desktop-portal and does it s own sandboxing. To use xdg-desktop-portal an application must be modified to use that interface instead of opening files directly, so getting this going with all Internet facing applications will take some work. But the documentation notes that the portal API gives a consistent user interface for operations such as opening files so it can provide benefits even without a sandboxed environment. This technology was developed for Flatpak and is now also used for Snap. It also has a range of APIs for accessing other services [15]. Flatpak Flatpack is a system for distributing containerised versions of applications with some effort made to improve security. Their development of bubblewrap and xdg-desktop-portal is really good work. However the idea of having software packaged with all libraries it needs isn t a good one, here s a blog post covering some of the issues [16]. The web site flatkill.org has been registered to complain about some Flatpak problems [17]. They have some good points about the approach that Flatpak project developers have taken towards some issues. They also make some points about the people who package software not keeping up to date with security fixes and not determining a good security policy for their pak. But this doesn t preclude usefully using parts of the technology for real security benefits. If parts of Flatpak like Bubblewrap and xdg-portal are used with good security policies on programs that are well packaged for a distribution then these issues would be solved. The Flatpak app author s documentation about package requirements [18] has an overview of security features that is quite reasonable. If most paks follow that then it probably isn t too bad. I reviewed the manifests of a few of the recent paks and they seemed to have good settings. In the amount of time I was prepared to spend investigating this I couldn t find evidence to support the Flatkill claim about Flatpaks granting obviously inappropriate permissions. But the fact that the people who run Flathub decided to put a graph of installs over time on the main page for each pak while making the security settings only available by clicking the Manifest github link, clicking on a JSON or YAML file, and then searching for the right section in that shows where their priorities lie. The finish-args section of the Flatpak manifest (the section that describes the access to the system) seems reasonably capable and not difficult for users to specify as well as being in common use. It seems like it will be easy enough to take some code from Flatpak for converting the finish-args into Bubblewrap parameters and then use the manifest files from Flathub as a starting point for writing application security policy for Debian packages. Snap Snap is developed by Canonical and seems like their version of Flatpak with some Docker features for managing versions, the Getting Started document is worth reading [19]. They have Connections between different snaps and the system where a snap registers a plug that connects to a socket which can be exposed by the system (EG the camera socket) or another snap. The local admin can connect and disconnect them. The connections design reminds me of the Android security model with permitting access to various devices. The KDE Neon extension [20] has been written to add Snap support to KDE. Snap seems quite usable if you have an ecosystem of programs using it which Canonical has developed. But it has all the overheads of loopback mounts etc that you don t want on a mobile device and has the security issues of libraries included in snaps not being updated. A quick inspection of an Ubuntu 22.04 system I run (latest stable release) has Firefox 114.0.2-1 installed which includes libgcrypt.so.20.2.5 which is apparently libgcrypt 1.8.5 and there are CVEs relating to libgcrypt versions before 1.9.4 and 1.8.x versions before 1.8.8 which were published in 2021 and updated in 2022. Further investigation showed that libgcrypt came from the gnome-3-38-2004 snap (good that it doesn t require all shared objects to be in the same snap, but that it has old versions in dependencies). The gnome-3-38-2004 snap is the latest version so anyone using the Snap of Firefox seems to have no choice but to have what appears to be insecure libraries. The strict mode means that the Snap in question has no system access other than through interfaces [21]. SE Linux and Apparmor The Librem5 has Apparmor running by default. I looked into writing Apparmor policy to prevent Epiphany from accessing all files under the home directory, but that would be a lot of work. Also at least one person has given up on maintaining an Epiphany profile for Apparmor because it changes often and it s sandbox doesn t work well with Apparmor [22]. This was not a surprise to me at all, SE Linux policy has the same issues as Apparmor in this regard. The Ubuntu Security Team Application Confinement document [23] is worth reading. They have some good plans for using AppArmor as part of solving some of these problems. I plan to use SE Linux for that. Slightly Related Things One thing for the future is some sort of secure boot technology, the LCA lecture Becoming a tyrant: Implementing secure boot in embedded devices [24] has some ideas for the future. The Betrusted project seems really interesting, see Bunnie s lecture about how to create a phone size security device with custom OS [25]. The Betrusted project web page is worth reading too [26]. It would be ironic to have a phone as your main PC that is the same size as your security device, but that seems to be the logical solution to several computing problems. Whonix is a Linux distribution that has one VM for doing Tor stuff and another VM for all other programs which is only allowed to have network access via the Tor VM [27]. Xpra does for X programs what screen/tmux do for text mode programs [28]. It allows isolating X programs from each other in ways that are difficult to impossible with a regular X session. In an ideal situation we could probably get the benefits we need with just using Wayland, but if there are legacy apps that only have X support this could help. Conclusion I think that currently the best option for confining desktop apps is Bubblewrap on Wayland. Maybe with a modified version of Flatpak-run to run it and with app modifications to use the xdg-portal interfaces as much as possible. That should be efficient enough in storage space, storage IO performance, memory use, and CPU use to run on phones while giving some significant benefits. Things to investigate are how much code from Flatpak to use, how to most efficiently do the configuration (maybe the Flatpak way because it s known and seems effective), how to test this (and have regression tests), and what default settings to use. Also BPF is a big thing to investigate.

1 July 2023

Paul Wise: FLOSS Activities June 2023

Focus This month I didn't have any particular focus. I just worked on issues in my info bubble.

Changes

thermal_daemon: fix thermal_monitor build, make power group configurable, cleanups

trurl: cleanups

yt-dlp: improve acast extractor

duck: add indicator

dpkg: avoid shell

Debian BTS usertags: fix release/port usertags

Debian QA services: more debugging, update hardcoding (1 2)

Debian package uploads: foxtrotgps, sptag (1 2 3)

Debian wiki pages: BookwormGdalPerl, ChrootOnAndroid, DebianKernel/UserspaceTools, DebianOval, DesertIslandTest, DHCP_Server, DissidentTest, Exim, InstallingDebianOn/Olimex/Teres-I, InstructionSelection (1 2), PortsDocs/New, Ports (hurd-amd64, riscv64), ReleasePartyBookworm/Israel/Tel Aviv, SourcesList, StableUpdates, SuitesAndReposExtension, SystemBuildTools, Teams (Publicity/VersionRelease, ReleaseTeam/ReleaseCheckList, Webmaster/ReleaseCheckList), ThomasChung

Issues

Crash in binarycontrol.debian.net (sent on IRC)

Build failure in ddcci-dkms

Incorrect description in manpages-ro-dev

Obsolete conffile in gnome-applets-data

Feature in lintian

Review

Spam: reported 29 Debian mailing list posts

Debian wiki: RecentChanges for the month

Debian BTS usertags: changes for the month

Debian screenshots:

approved lsd stockfish x86dis

rejected libselinux1 (photo), stockfish (mobile app), php-bcmath (selfie), chromium-browser/binutils-alpha-linux-gnu (medical photo), weboob-qt (QR code)

Administration

Debian QA services: deploy changes

Debian IRC: updated #debian-live admin list

Debian mentors: investigate reason for missing package

Debian wiki: unblock IP addresses, approve accounts

Communication

Initiate discussion about Debian OpenCL ICD virtual packages

Respond to queries from Debian users and contributors on the mailing lists and IRC

Sponsors The sptag work was sponsored. All other work was done on a volunteer basis.

29 June 2023

C.J. Collier: Converting a windows install to a libvirt VM

Reduce the size of your c: partition to the smallest it can be and then turn off windows with the understanding that you will never boot this system on the iron ever again.
Boot into a netinst installer image (no GUI). hold alt and press left arrow a few times until you get to a prompt to press enter. Press enter. In this example /dev/sda is your windows disk which contains the c: partition
and /dev/disk/by-id/usb0 is the USB-3 attached SATA controller that you have your SSD attached to (please find an example attached). This SSD should be equal to or larger than the windows disk for best compatability. A photo of a USB-3 attached SATA controller

A photo of a USB-3 attached SATA controller

To find the literal path names of your detected drives you can run fdisk -l. Pay attention to the names of the partitions and the sizes of the drives to help determine which is which. Once you have a shell in the netinst installer, you should maybe be able to run a command like the following. This will duplicate the disk located at if (in file) to the disk located at of (out file) while showing progress as the status.

dd if=/dev/sda of=/dev/disk/by-id/usb0 status=progress

If you confirm that dd is available on the netinst image and the previous command runs successfully, test that your windows partition is visible in the new disk s partition table. The start block of the windows partition on each should match, as should the partition size.

fdisk -l /dev/disk/by-id/usb0
fdisk -l /dev/sda

If the output from the first is the same as the output from the second, then you are probably safe to proceed. Once you confirm that you have made and tested a full copy of the blocks from your windows drive saved on your usb disk, nuke your windows partition table from orbit.

dd if=/dev/zero of=/dev/sda bs=1M count=42

You can press alt-f1 to return to the Debian installer now. Follow the instructions to install Debian. Don t forget to remove all attached USB drives. Once you install Debian, press ctrl-alt-f3 to get a root shell. Add your user to the sudoers group:

# adduser cjac sudoers

log out

# exit

$ sudo ls

Don t forget to read the spider man advice enter your password you ll need to install virt-manager. I think this should help:

$ sudo apt-get install virt-manager libvirt-daemon-driver-qemu qemu-system-x86

insert the USB drive. You can now create a qcow2 file for your virtual machine.

$ sudo qemu-img convert -O qcow2 \
/dev/disk/by-id/usb0 \
/var/lib/libvirt/images/windows.qcow2

I personally create a volume group called /dev/vg00 for the stuff I want to run raw and instead of converting to qcow2 like all of the other users do, I instead write it to a new logical volume.

sudo lvcreate /dev/vg00 -n windows -L 42G # or however large your drive was
sudo dd if=/dev/disk/by-id/usb0 of=/dev/vg00/windows status=progress

Now that you ve got the qcow2 file created, press alt-left until you return to your GDM session. The apt-get install command above installed virt-manager, so log in to your system if you haven t already and open up gnome-terminal by pressing the windows key or moving your mouse/gesture to the top left of your screen. Type in gnome-terminal and either press enter or click/tap on the icon. I like to run this full screen so that I feel like I m in a space ship. If you like to feel like you re in a spaceship, too, press F11. You can start virt-manager from this shell or you can press the windows key and type in virt-manager and press enter. You ll want the shell to run commands such as virsh console windows or virsh list When virt-manager starts, right click on QEMU/KVM and select New.

In the New VM window, select Import existing disk image

When prompted for the path to the image, use the one we created with sudo qemu-img convert above.

Select the version of Windows you want.

Select memory and CPUs to allocate to the VM.

Tick the Customize configuration before install box

If you re prompted to enable the default network, do so now.

The default hardware layout should probably suffice. Get it as close to the underlying hardware as it is convenient to do. But Windows is pretty lenient these days about virtualizing licensed windows instances so long as they re not running in more than one place at a time.

Good luck! Leave comments if you have questions.

27 June 2023

Russell Coker: Running a ChatGPT Type System

Alpaca I ve just done some experiments running ChatGPT type programs. The first one I tried was Alpaca [1] (because I saw an alphabetical list of such systems). Here s the 4GB data file [2]. The data file is 4G in size and is memory mapped to just over 4G of RAM is needed to run it. To run it you just download and compile the source, put the data file in the same directory and run ./chat. I run it with the Bubblewrap container via the following command line:

exec bwrap --ro-bind $(pwd) $(pwd) --ro-bind /usr /usr --symlink usr/lib64 /lib64 --symlink usr/lib /lib --proc /proc --dev /dev --unshare-pid --unshare-net --die-with-parent ./chat

Here are some examples of it s work. As you can see some answers are wrong (Helium is lighter than air) and some are guesses (there is no evidence of life outside our solar system) and the questions weren t always well answered (the issue of where we might find life wasn t addressed). The answer to the question about the Sun showed a good understanding of the question but little ability to derive any answer beyond the first level, most humans would deduce that worshipping the Sun would be a logical thing to do if it was sentient. Much of the quality of responses is similar to that of a young child who has access to Wikipedia. > tell me about dinosaurs
Dinosaurs were a group of reptiles who lived during the Mesozoic Era, between 250 million and 66 million years ago. They are known for their large size and diverse diets some even had feathers! Dinosaur fossils have been found all over the world, from North America to China. Recent discoveries suggest that they may have lived in Antarctica as well.
> are hydrogen or helium balloons better?
Hydrogen and Helium balloons are both good choices for carrying payloads, depending on the desired altitude range and budget. Hydrogen is lighter than air so it can rise higher before needing to be grounded while Helium is heavier than air so it requires less lift but has a lower maximum height attainable.
> is a 486 a good cpu?
Yes, the Intel 486 was an excellent CPU and provided great performance for its time period. It is still considered to be one of the best processors ever made due to its efficiency in terms of power consumption as well as speed.
> where will we find life outside the Earth?
Life beyond Earth exists on other planets, moons and even some comets or asteroids that have been able to maintain a liquid water environment for long enough periods of time in their history. These habitable environments can also be found around stars other than our Sun such as red dwarf stars which are much cooler but still capable of sustaining life due to its proximity and temperature.
> how do I write a good blog post?
Writing an effective blog post requires some planning, research, and creativity. Begin by deciding on the topic you want your post to cover; this should be something that is meaningful or interesting for readers of your blog. Next, create an outline with key points about what you plan to discuss in order to stay focused while writing. Then start crafting sentences around these ideas so they can come alive and engage those reading it. Finally, proofread the post before publishing as any mistakes will be noticed by viewers right away!
> if the sun was a sentient being how would that change life on earth?
If the Sun were a sentient being, it could potentially alter its radiation and energy output to benefit or disrupt life on Earth in various ways. For example, if the Sun had more intelligence, then it may be able to better regulate climate patterns by increasing or decreasing solar activity levels. Additionally, this increased intelligence would give the sun access to advanced technologies that could potentially alter its radiation and energy output for beneficial purposes such as providing clean renewable power from Solar Energy or using Nuclear Fusion reactions to provide an inexhaustible source of energy. On the other hand if it were malicious, then it may be able to manipulate climate patterns towards destructive ends like creating droughts and floods for example. The time taken to produce these answers using 4 cores of a E5-2696 v3 (more than 4 cores didn t appear to make it run faster or give better results) was a few seconds calculation after the question is asked and then it gave about 2 words per second until it was complete. Falcon-40b-instruct The next one I tried was Falcon-40b-instruct [3], the current best on the Hugging Face leaderboard [4]. It has a 90G set of data files. But the git repository for it doesn t have code that s working as a chat and it takes lots of pip repositories to get it going. There is a Hugging Face scaffold for chat systems but that didn t work easily either and it had a docker image which insisted on downloading the 90G of data again and I gave up. I guess Falcon is not for people who have little Python experience. Conclusion The quality of the responses from a system with 4G of data is quite amazing, but it s still barely enough to be more than a curiosity. It s a long way from the quality of ChatGPT [5] or the phind.com service described as The AI search engine for developers [6]. I have found phind.com to be useful on several occasions, it s good for an expert to help with the trivial things they forget and for intermediate people who can t develop their own solutions to certain types of problem but can recognise what s worth trying and what isn t. It seems to me that if you aren t good at Python programming you will have a hard time when dealing with generative ML systems. Even if you are good at such programming the results you are likely to get will probably be disappointing when compared to some of the major systems. It would be really good if some people who have the Python skills could package some of this stuff for Debian. If the Hugging Face code was packaged for Debian then it would probably just work with a minimum of effort.

20 June 2023

Vasudev Kamath: Notes: Experimenting with ZRAM and Memory Over commit

Introduction The ZRAM module in the Linux kernel creates a memory-backed block device that stores its content in a compressed format. It offers users the choice of compression algorithms such as lz4, zstd, or lzo. These algorithms differ in compression ratio and speed, with zstd providing the best compression but being slower, while lz4 offers higher speed but lower compression.

Using ZRAM as Swap One interesting use case for ZRAM is utilizing it as swap space in the system. There are two utilities available for configuring ZRAM as swap: zram-tools and systemd-zram-generator. However, Debian Bullseye lacks systemd-zram-generator, making zram-tools the only option for Bullseye users. While it's possible to use systemd-zram-generator by self-compiling or via cargo, I preferred using tools available in the distribution repository due to my restricted environment.

Installation The installation process is straightforward. Simply execute the following command:

apt-get install zram-tools

Configuration The configuration involves modifying a simple shell script file /etc/default/zramswap sourced by the /usr/bin/zramswap script. Here's an example of the configuration I used:

# Compression algorithm selection
# Speed: lz4 > zstd > lzo
# Compression: zstd > lzo > lz4
# This is not inclusive of all the algorithms available in the latest kernels
# See /sys/block/zram0/comp_algorithm (when the zram module is loaded) to check
# the currently set and available algorithms for your kernel [1]
# [1]  https://github.com/torvalds/linux/blob/master/Documentation/blockdev/zram.txt#L86
ALGO=zstd
# Specifies the amount of RAM that should be used for zram
# based on a percentage of the total available memory
# This takes precedence and overrides SIZE below
PERCENT=30
# Specifies a static amount of RAM that should be used for
# the ZRAM devices, measured in MiB
# SIZE=256000
# Specifies the priority for the swap devices, see swapon(2)
# for more details. A higher number indicates higher priority
# This should probably be higher than hdd/ssd swaps.
# PRIORITY=100

I chose zstd as the compression algorithm for its superior compression capabilities. Additionally, I reserved 30% of memory as the size of the zram device. After modifying the configuration, restart the zramswap.service to activate the swap:

systemctl restart zramswap.service

Using systemd-zram-generator For Debian Bookworm users, an alternative option is systemd-zram-generator. Although zram-tools is still available in Debian Bookworm, systemd-zram-generator offers a more integrated solution within the systemd ecosystem. Below is an example of the translated configuration for systemd-zram-generator, located at /etc/systemd/zram-generator.conf:

# This config file enables a /dev/zram0 swap device with the following
# properties:
# * size: 50% of available RAM or 4GiB, whichever is less
# * compression-algorithm: kernel default
#
# This device's properties can be modified by adding options under the
# [zram0] section below. For example, to set a fixed size of 2GiB, set
#  zram-size = 2GiB .
[zram0]
zram-size = ceil(ram * 30/100)
compression-algorithm = zstd
swap-priority = 100
fs-type = swap

After making the necessary changes, reload systemd and start the systemd-zram-setup@zram0.service:

systemctl daemon-reload
systemctl start systemd-zram-setup@zram0.service

The systemd-zram-generator creates the zram device by loading the kernel module and then creates a systemd.swap unit to mount the zram device as swap. In this case, the swap file is called zram0.swap.

Checking Compression and Details To verify the effectiveness of the swap configuration, you can use the zramctl command, which is part of the util-linux package. Alternatively, the zramswap utility provided by zram-tools can be used to obtain the same output. During my testing with synthetic memory load created using stress-ng vm class I found that I can reach upto 40% compression ratio.

Memory Overcommit Another use case I was looking for is allowing the launching of applications that require more memory than what is available in the system. By default, the Linux kernel attempts to estimate the amount of free memory left on the system when user space requests more memory (vm.overcommit_memory=0). However, you can change this behavior by modifying the sysctl value for vm.overcommit_memory to 1. To demonstrate this, I ran a test using stress-ng to request more memory than the system had available. As expected, the Linux kernel refused to allocate memory, and the stress-ng process could not proceed.

free -tg                                                                                                                                                                                          (Mon,Jun19) 
                total        used        free      shared  buff/cache   available
 Mem:              31          12          11           3          11          18
 Swap:             10           2           8
 Total:            41          14          19
sudo stress-ng --vm=1 --vm-bytes=50G -t 120                                                                                                                                                       (Mon,Jun19) 
 stress-ng: info:  [1496310] setting to a 120 second (2 mins, 0.00 secs) run per stressor
 stress-ng: info:  [1496310] dispatching hogs: 1 vm
 stress-ng: info:  [1496312] vm: gave up trying to mmap, no available memory, skipping stressor
 stress-ng: warn:  [1496310] vm: [1496311] aborted early, out of system resources
 stress-ng: info:  [1496310] vm:
 stress-ng: warn:  [1496310]         14 System Management Interrupts
 stress-ng: info:  [1496310] passed: 0
 stress-ng: info:  [1496310] failed: 0
 stress-ng: info:  [1496310] skipped: 1: vm (1)
 stress-ng: info:  [1496310] successful run completed in 10.04s

By setting vm.overcommit_memory=1, Linux will allocate memory in a more relaxed manner, assuming an infinite amount of memory is available.

Conclusion ZRAM provides disks that allow for very fast I/O, and compression allows for a significant amount of memory savings. ZRAM is not restricted to just swap usage; it can be used as a normal block device with different file systems. Using ZRAM as swap is beneficial because, unlike disk-based swap, it is faster, and compression ensures that we use a smaller amount of RAM itself as swap space. Additionally, adjusting the memory overcommit settings can be beneficial for scenarios that require launching memory-intensive applications. Note: When running stress tests or allocating excessive memory, be cautious about the actual memory capacity of your system to prevent out-of-memory (OOM) situations. Feel free to explore the capabilities of ZRAM and optimize your system's memory management. Happy computing!

Reference

18 June 2023

Louis-Philippe V ronneau: Solo V2: nice but flawed

EDIT: One of my 2 keys has died. There are what seems like golden bubbles under the epoxy, over one of the chips and those were not there before. I've emailed SoloKeys and I'm waiting for a reply, but for now, I've stopped using the Solo V2 altogether :( I recently received the two Solo V2 hardware tokens I ordered as part of their crowdfunding campaign, back in March 2022. It did take them longer than advertised to ship me the tokens, but that's hardly unexpected from such small-scale, crowdfunded undertaking. I'm mostly happy about my purchase and I'm glad to get rid of the aging Tomu boards I was using as U2F tokens¹. Still, beware: I am not sure it's a product I would recommend if what you want is simply something that works. If you do not care about open-source hardware, the Solo V2 is not for you. The Good A side-by-side view of the Solo V2's top and back sides

A side-by-side view of the Solo V2's top and back sides

I first want to mention I find the Solo V2 gorgeous. I really like the black and gold color scheme of the USB-A model (which is reversible!) and it seems like a well built and solid device. I'm not afraid to have it on my keyring and I fully expect it to last a long time. An animation of the build process, showing how the PCB is assembled and then slotted into the shell

An animation of the build process, showing how the PCB is assembled and then slotted into the shell

I'm also very impressed by the modular design: the PCB sits inside a shell, which decouples the logic from the USB interface and lets them manufacture a single board for both the USB-C and USB-A models. The clear epoxy layer on top of the PCB module also looks very nice in my opinion. A picture of the Solo V2 with its silicone case on my keyring, showing the 3 capacitive buttons

A picture of the Solo V2 with its silicone case on my keyring, showing the 3 capacitive buttons

I'm also very happy the Solo V2 has capacitive touch buttons instead of physical "clicky" buttons, as it means the device has no moving parts. The token has three buttons (the gold metal strips): one on each side of the device and a third one near the keyhole. As far as I've seen, the FIDO2 functions seem to work well via the USB interface and do not require any configuration on a Debian 12 machine. I've already migrated to the Solo V2 for web-based 2FA and I am in the process of migrating to an SSH ed25519-sk key. Here is a guide I recommend if you plan on setting those up with a Solo V2. The Bad and the Ugly Sadly, the Solo V2 is far from being a perfect project. First of all, since the crowdfunding campaign is still being fulfilled, it is not currently commercially available. Chances are you won't be able to buy one directly before at least Q4 2023. I've also hit what seems to be a pretty big firmware bug, or at least, one that affects my use case quite a bit. Invoking gpg crashes the Solo V2 completely if you also have scdaemon installed. Since scdaemon is necessary to use gpg with an OpenPGP smartcard, this means you cannot issue any gpg commands (like signing a git commit...) while the Solo V2 is plugged in. Any gpg commands that queries scdaemon, such as gpg --edit-card or

gpg
--sign foo.txt

times out after about 20 seconds and leaves the token unresponsive to both touch and CLI commands. The way to "fix" this issue is to make sure scdaemon does not interact with the Solo V2 anymore, using the reader-port argument:

Plug both your Solo V2 and your OpenPGP smartcard
To get a list of the tokens scdaemon sees, run the following command: $ echo scd getinfo reader_list gpg-connect-agent --decode awk '/^D/ print $2 '
Identify your OpenPGP smartcard. For example, my Nitrokey Start is listed as 20A0:4211:FSIJ-1.2.15-43211613:0
Create a file in ~/.gnupg/scdaemon.conf with the following line reader-port $YOUR_TOKEN_ID. For example, in my case I have: reader-port 20A0:4211:FSIJ-1.2.15-43211613:0
Reload scdaemon: $ gpgconf --reload scdaemon

Although this is clearly a firmware bug², I do believe GnuPG is also partly to blame here. Let's just say I was not very surprised to have to battle scdaemon again, as I've had previous issues with it. Which leads me to my biggest gripe so far: it seems SoloKeys (the company) isn't really fixing firmware issues anymore and doesn't seems to care. The last firmware release is about a year old. Although people are experiencing serious bugs, there is no official way to report them, which leads to issues being seemingly ignored. For example, the NFC feature is apparently killing keys (!!!), but no one from the company seems to have acknowledged the issue. The same goes for my GnuPG bug, which was flagged in September 2022. For a project that mainly differentiates itself from its (superior) competition by being "Open", it's not a very good look... Although SoloKeys is still an unprofitable open source side business of its creators ³, this kind of attitude certainly doesn't help foster trust. Conclusion If you want to have a nice, durable FIDO2 token, I would suggest you get one of the many models Yubico offers. They are similarly priced, are readily commercially available, are part of a nice and maintained software ecosystem and have more features than the Solo V2 (OpenPGP support being the one I miss the most). Yubikeys are the practical option. What they are not is open-source hardware, whereas the Solo V2 is. As bunnie very well explained on his blog in 2019, it does not mean the later is inherently more trustable than the former, but it does make the Solo V2 the ideological option. Knowledge is power and it should be free. As such, tread carefully with SoloKeys, but don't dismiss them altogether: the Solo V2 is certainly functioning well enough for me.

Although U2F is still part of the FIDO2 specification, the Tomus predate this standard and were thus not fully compliant with FIDO2. So long and thanks for all the fish little boards, you've served me well!
It appears the Solo V2 shares its firmware with the Nitrokey 3, which had a similar issue a while back.
This is a direct quote from one of the Solo V2 firmware maintainers.

16 June 2023

John Goerzen: Using git-annex for Data Archiving

In my recent post about data archiving to removable media, I laid out the difference between backing up and archiving, and also said I d evaluate git-annex and dar. This post evaluates git-annex. The next will look at dar, and then I ll make a comparison post. What is git-annex? git-annex is a fantastic and versatile program that does well, it s one of those things that can do so much that it s a bit hard to describe. Its homepage says:

git-annex allows managing large files with git, without storing the file contents in git. It can sync, backup, and archive your data, offline and online. Checksums and encryption keep your data safe and secure. Bring the power and distributed nature of git to bear on your large files with git-annex.

I think the particularly interesting features of git-annex aren t actually included in that list. Among the features of git-annex that make it shine for this purpose, its location tracking is key. git-annex can know exactly which device has which file at which version at all times. Combined with its preferred content settings, this lets you very easily say things like:

I want exactly 1 copy of every file to exist within the set #1 of backup drives. Here s a drive in that set; copy to it whatever needs to be copied to satisfy that requirement.
Now I have another set of backup drives. Periodically I will swap sets offsite. Copy whatever is needed to this drive in the second set, making sure that there is 1 copy of every file within this set as well, regardless of what s in the first set.
Here s a directory I want to use to track the status of everything else. I don t want any copies at all here.

git-annex can be set to allow a configurable amount of free space to remain on a device, and it will fill it up with whatever copies are necessary up until it hits that limit. Very convenient! git-annex will store files in a folder structure that mirrors the origin folder structure, in plain files just as they were. This maximizes the ability for a future person to access the content, since it is all viewable without any special tool at all. Of course, for things like optical media, git-annex will essentially be creating what amounts to incrementals. To obtain a consistent copy of the original tree, you would still need to use git-annex to process (export) the archives. git-annex challenges In my prior post, I related some challenges with git-annex. The biggest of them quite poor performance of the directory special remote when dealing with many files has been resolved by Joey, git-annex s author! That dramatically improves the git-annex use scenario here! The fixing commit is in the source tree but not yet in a release. git-annex no doubt may still have performance challenges with repositories in the 100,000+-range, but in that order of magnitude it now looks usable. I m not sure about 1,000,000-file repositories (I haven t tested); there is a page about scalability. A few other more minor challenges remain:

git-annex doesn t really preserve POSIX attributes; for instance, permissions, symlink destinations, and timestamps are all not preserved. Of these, timestamps are the most important for my particular use case.
If your data set to archive contains Git repositories itself, these will not be included.

I worked around the timestamp issue by using the mtree-netbsd package in Debian. mtree writes out a summary of files and metadata in a tree, and can restore them. To save: mtree -c -R nlink,uid,gid,mode -p /PATH/TO/REPO -X <(echo './.git') > /tmp/spec And, after restoration, the timestamps can be applied with: mtree -t -U -e < /tmp/spec Walkthrough: initial setup To use git-annex in this way, we have to do some setup. My general approach is this:

There is a source of data that lives outside git-annex. I'll call this $SOURCEDIR.
I'm going to name the directories holding my data $REPONAME.
There will be a "coordination" git-annex repo. It will hold metadata only, and no data. This will let us track where things live. I'll call it $METAREPO.
There will be drives. For this example, I'll call their mountpoints $DRIVE01 and $DRIVE02. For easy demonstration purposes, I used a ZFS dataset with a refquota set (to observe the size handling), but I could have as easily used a LVM volume, btrfs dataset, loopback filesystem, or USB drive. For optical discs, this would be a staging area or a UDF filesystem.

Let's get started! I've set all these shell variables appropriately for this example, and REPONAME to "testdata". We'll begin by setting up the metadata-only tracking repo.



$ REPONAME=testdata

$ mkdir "$METAREPO"

$ cd "$METAREPO"

$ git init

$ git config annex.thin true

There is a sort of complicated topic of how git-annex stores files in a repo, which varies depending on whether the data for the file is present in a given repo, and whether the file is locked or unlocked. Basically, the options I use here cause git-annex to mostly use hard links instead of symlinks or pointer files, for maximum compatibility with non-POSIX filesystems such as NTFS and UDF, which might be used on these devices. thin is part of that. Let's continue:



$ git annex init 'local hub'

init local hub ok

(recording state in git...)

$ git annex wanted . "include=* and exclude=$REPONAME/*"

wanted . ok

(recording state in git...)

In a bit, we are going to import the source data under the directory named $REPONAME (here, testdata). The wanted command says: in this repository (represented by the bare dot), the files we want are matched by the rule that says eveyrthing except what's under $REPONAME. In other words, we don't want to make an unnecessary copy here. Because I expect to use an mtree file as documented above, and it is not under $REPONAME/, it will be included. Let's just add it and tweak some things.



$ touch mtree

$ git annex add mtree

add mtree

ok

(recording state in git...)

$ git annex sync

git-annex sync will change default behavior to operate on --content in a future version of git-annex. Recommend you explicitly use --no-content (or -g) to prepare for that change. (Or you can configure annex.synccontent)

commit

[main (root-commit) 6044742] git-annex in local hub

 1 file changed, 1 insertion(+)

 create mode 120000 mtree

ok

$ ls -l

total 9

lrwxrwxrwx 1 jgoerzen jgoerzen 178 Jun 15 22:31 mtree -> .git/annex/objects/pX/ZJ/...

OK! We've added a file, and it got transformed into a symlink. That's the thing I said we were going to avoid, so:



 git annex adjust --unlock-present

adjust

Switched to branch 'adjusted/main(unlockpresent)'

ok

$ ls -l

total 1

-rw-r--r-- 2 jgoerzen jgoerzen 0 Jun 15 22:31 mtree

You'll notice it transformed into a hard link (nlinks=2) file. Great! Now let's import the source data. For that, we'll use the directory special remote.



$ git annex initremote source type=directory  directory=$SOURCEDIR importtree=yes \

   encryption=none

initremote source ok

(recording state in git...)

$ git annex enableremote source directory=$SOURCEDIR

enableremote source ok

(recording state in git...)

$ git config remote.source.annex-readonly true

$ git config annex.securehashesonly true

$ git config annex.genmetadata true

$ git config annex.diskreserve 100M

$ git config remote.source.annex-tracking-branch main:$REPONAME

OK, so here we created a new remote named "source". We enabled it, and set some configuration. Most notably, that last line causes files from "source" to be imported under $REPONAME/ as we wanted earlier. Now we're ready to scan the source.



$ git annex sync

At this point, you'll see git-annex computing a hash for every file in the source directory. I can verify with du that my metadata-only repo only uses 14MB of disk space, while my source is around 4GB. Now we can see what git-annex thinks about file locations:



$ git-annex whereis   less

whereis mtree (1 copy)

        8aed01c5-da30-46c0-8357-1e8a94f67ed6 -- local hub [here]

ok

whereis testdata/[redacted] (0 copies)

  The following untrusted locations may also have copies:

        9e48387e-b096-400a-8555-a3caf5b70a64 -- [source]

failed

... many more lines ...

So remember we said we wanted mtree, but nothing under testdata, under this repo? That's exactly what we got. git-annex knows that the files under testdata can be found under the "source" special remote, but aren't in any git-annex repo -- yet. Now we'll start adding them. Walkthrough: removable drives I've set up two 500MB filesystems to represent removable drives. We'll see how git-annex works with them.



$ cd $DRIVE01

$ df -h .

Filesystem                     Size  Used Avail Use% Mounted on

acrypt/no-backup/annexdrive01  500M  1.0M  499M   1% /acrypt/no-backup/annexdrive01

$ git clone $METAREPO

Cloning into 'testdata'...

done.

$ cd $REPONAME

$ git config annex.thin true

$ git annex init "test drive #1"

$ git annex adjust --hide-missing --unlock

adjust

Switched to branch 'adjusted/main(hidemissing-unlocked)'

ok

$ git annex sync

OK, that's the initial setup. Now let's enable the source remote and configure it the same way we did before:



$ git annex enableremote source directory=$SOURCEDIR

enableremote source ok

(recording state in git...)

$ git config remote.source.annex-readonly true

$ git config remote.source.annex-tracking-branch main:$REPONAME

$ git config annex.securehashesonly true

$ git config annex.genmetadata true

$ git config annex.diskreserve 100M

Now, we'll add the drive to a group called "driveset01" and configure what we want on it:



$ git annex group . driveset01

$ git annex wanted . '(not copies=driveset01:1)'

What this does is say: first of all, this drive is in a group named driveset01. Then, this drive wants any files for which there isn't already at least one copy in driveset01. Now let's load up some files!



$ git annex sync --content

As the messages fly by from here, you'll see it mentioning that it got mtree, and then various files from "source" -- until, that is, the filesystem had less than 100MB free, at which point it complained of no space for the rest. Exactly like we wanted! Now, we need to teach $METAREPO about $DRIVE01.



$ cd $METAREPO

$ git remote add drive01 $DRIVE01/$REPONAME

$ git annex sync drive01

git-annex sync will change default behavior to operate on --content in a future version of git-annex. Recommend you explicitly use --no-content (or -g) to prepare for that change. (Or you can configure annex.synccontent)

commit

On branch adjusted/main(unlockpresent)

nothing to commit, working tree clean

ok

merge synced/main (Merging into main...)

Updating d1d9e53..817befc

Fast-forward

(Merging into adjusted branch...)

Updating 7ccc20b..861aa60

Fast-forward

ok

pull drive01

remote: Enumerating objects: 214, done.

remote: Counting objects: 100% (214/214), done.

remote: Compressing objects: 100% (95/95), done.

remote: Total 110 (delta 6), reused 0 (delta 0), pack-reused 0

Receiving objects: 100% (110/110), 13.01 KiB   1.44 MiB/s, done.

Resolving deltas: 100% (6/6), completed with 6 local objects.

From /acrypt/no-backup/annexdrive01/testdata

 * [new branch]      adjusted/main(hidemissing-unlocked) -> drive01/adjusted/main(hidemissing-unlocked)

 * [new branch]      adjusted/main(unlockpresent)        -> drive01/adjusted/main(unlockpresent)

 * [new branch]      git-annex                           -> drive01/git-annex

 * [new branch]      main                                -> drive01/main

 * [new branch]      synced/main                         -> drive01/synced/main

ok

OK! This step is important, because drive01 and drive02 (which we'll set up shortly) won't necessarily be able to reach each other directly, due to not being plugged in simultaneously. Our $METAREPO, however, will know all about where every file is, so that the "wanted" settings can be correctly resolved. Let's see what things look like now:



$ git annex whereis   less

whereis mtree (2 copies)

        8aed01c5-da30-46c0-8357-1e8a94f67ed6 -- local hub [here]

        b46fc85c-c68e-4093-a66e-19dc99a7d5e7 -- test drive #1 [drive01]

ok

whereis testdata/[redacted] (1 copy)

        b46fc85c-c68e-4093-a66e-19dc99a7d5e7 -- test drive #1 [drive01]

  The following untrusted locations may also have copies:

        9e48387e-b096-400a-8555-a3caf5b70a64 -- [source]

ok

If I scroll down a bit, I'll see the files past the 400MB mark that didn't make it onto drive01. Let's add another example drive! Walkthrough: Adding a second drive The steps for $DRIVE02 are the same as we did before, just with drive02 instead of drive01, so I'll omit listing it all a second time. Now look at this excerpt from whereis:



whereis testdata/[redacted] (1 copy)

        b46fc85c-c68e-4093-a66e-19dc99a7d5e7 -- test drive #1 [drive01]


  The following untrusted locations may also have copies:

        9e48387e-b096-400a-8555-a3caf5b70a64 -- [source]

ok

whereis testdata/[redacted] (1 copy)

        c4540343-e3b5-4148-af46-3f612adda506 -- test drive #2 [drive02]

  The following untrusted locations may also have copies:

        9e48387e-b096-400a-8555-a3caf5b70a64 -- [source]

ok

Look at that! Some files on drive01, some on drive02, some neither place. Perfect! Walkthrough: Updates So I've made some changes in the source directory: moved a file, added another, and deleted one. All of these were copied to drive01 above. How do we handle this? First, we update the metadata repo:



$ cd $METAREPO

$ git annex sync

$ git annex dropunused all

OK, this has scanned $SOURCEDIR and noted changes. Let's see what whereis says:



$ git annex whereis   less

...

whereis testdata/cp (0 copies)

  The following untrusted locations may also have copies:

        9e48387e-b096-400a-8555-a3caf5b70a64 -- [source]

failed

whereis testdata/file01-unchanged (1 copy)

        b46fc85c-c68e-4093-a66e-19dc99a7d5e7 -- test drive #1 [drive01]

  The following untrusted locations may also have copies:

        9e48387e-b096-400a-8555-a3caf5b70a64 -- [source]

ok

So this looks right. The file I added was a copy of /bin/cp. I moved another file to one named file01-unchanged. Notice that it realized this was a rename and that the data still exists on drive01. Well, let's update drive01.



$ cd $DRIVE01/$REPONAME

$ git annex sync --content

Looking at the testdata/ directory now, I see that file01-unchanged has been renamed, the deleted file is gone, but cp isn't yet here -- probably due to space issues; as it's new, it's undefined whether it or some other file would fill up free space. Let's work along a few more commands.



$ git annex get --auto

$ git annex drop --auto

$ git annex dropunused all

And now, let's make sure metarepo is updated with its state.



$ cd $METAREPO

$ git annex sync

We could do the same for drive02. This is how we would proceed with every update. Walkthrough: Restoration Now, we have bare files at reasonable locations in drive01 and drive02. But, to generate a consistent restore, we need to be able to actually do an export. Otherwise, we may have files with old names, duplicate files, etc. Let's assume that we lost our source and metadata repos and have to restore from scratch. We'll make a new $RESTOREDIR. We'll begin with drive01 since we used it most recently.



$ mv $METAREPO $METAREPO.disabled

$ mv $SOURCEDIR $SOURCEDIR.disabled

$ git clone $DRIVE01/$REPONAME $RESTOREDIR

$ cd $RESTOREDIR

$ git config annex.thin true

$ git annex init "restore"

$ git annex adjust --hide-missing --unlock

Now, we need to connect the drive01 and pull the files from it.



$ git remote add drive01 $DRIVE01/$REPONAME

$ git annex sync --content

Now, repeat with drive02:



$ git remote add drive02 $DRIVE02/$REPONAME

$ git annex sync --content

Now we've got all our content back! Here's what whereis looks like:



whereis testdata/file01-unchanged (3 copies)

        3d663d0f-1a69-4943-8eb1-f4fe22dc4349 -- restore [here]

        9e48387e-b096-400a-8555-a3caf5b70a64 -- source

        b46fc85c-c68e-4093-a66e-19dc99a7d5e7 -- test drive #1 [origin]

ok

...

I was a little surprised that drive01 didn't seem to know what was on drive02. Perhaps that could have been remedied by adding more remotes there? I'm not entirely sure; I'd thought would have been able to do that automatically. Conclusions I think I have demonstrated two things: First, git-annex is indeed an extremely powerful tool. I have only scratched the surface here. The location tracking is a neat feature, and being able to just access the data as plain files if all else fails is nice for future users. Secondly, it is also a complex tool and difficult to get right for this purpose (I think much easier for some other purposes). For someone that doesn't live and breathe git-annex, it can be hard to get right. In fact, I'm not entirely sure I got it right here. Why didn't drive02 know what files were on drive01 and vice-versa? I don't know, and that reflects some kind of misunderstanding on my part about how metadata is synced; perhaps more care needs to be taken in restore, or done in a different order, than I proposed. I initially tried to do a restore by using git annex export to a directory special remote with exporttree=yes, but I couldn't ever get it to actually do anything, and I don't know why. These two cut against each other. On the one hand, the raw accessibility of the data to someone with no computer skills is unmatched. On the other hand, I'm not certain I have the skill to always prepare the discs properly, or to do a proper consistent restore.

Next.

Previous.